diff --git a/Documentation/admin-guide/perf/hisi-pcie-pmu.rst b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst new file mode 100644 index 0000000000000000000000000000000000000000..083ca50de896be5f4e1090812fe0e80e856d7c74 --- /dev/null +++ b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst @@ -0,0 +1,148 @@ +================================================ +HiSilicon PCIe Performance Monitoring Unit (PMU) +================================================ + +On Hip09, the HiSilicon PCIe Performance Monitoring Unit (PMU) can monitor +bandwidth, latency, bus utilization and buffer occupancy data of PCIe. + +Each PCIe Core has a PMU to monitor multiple Root Ports of this PCIe Core and +all Endpoints downstream of these Root Ports. + + +HiSilicon PCIe PMU driver +========================= + +The PCIe PMU driver registers a perf PMU with the name of its sicl-id and PCIe +Core id.:: + + /sys/bus/event_source/hisi_pcie_core + +The PMU driver provides descriptions of available events and filter options in sysfs, +see /sys/bus/event_source/devices/hisi_pcie_core. + +The "format" directory describes all formats of the config (events), config1 and +config2 (filter options) fields of the perf_event_attr structure. The "events" directory +describes all documented events shown in perf list. + +The "identifier" sysfs file allows users to identify the version of the +PMU hardware device. + +The "bus" sysfs file allows users to get the bus number of Root Ports +monitored by the PMU. Furthermore, users can get the Root Ports range in +[bdf_min, bdf_max] from the "bdf_min" and "bdf_max" sysfs attributes +respectively. + +Example usage of perf:: + + $# perf list + hisi_pcie0_core0/rx_mwr_latency/ [kernel PMU event] + hisi_pcie0_core0/rx_mwr_cnt/ [kernel PMU event] + ------------------------------------------ + + $# perf stat -e hisi_pcie0_core0/rx_mwr_latency,port=0xffff/ + $# perf stat -e hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/ + +Related events are usually used to calculate the bandwidth, latency or others. 
+They need to start and end counting at the same time; therefore related events +are best used in the same event group to get the expected value. There are two +ways to know if they are related events: + +a) By event name, such as the latency events "xxx_latency, xxx_cnt" or + bandwidth events "xxx_flux, xxx_time". +b) By event type, such as "event=0xXXXX, event=0x1XXXX". + +Example usage of perf group:: + + $# perf stat -e "{hisi_pcie0_core0/rx_mwr_latency,port=0xffff/,hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/}" + +The current driver does not support sampling, so "perf record" is unsupported. +Attaching to a task is also unsupported for the PCIe PMU. + +Filter options +-------------- + +1. Target filter + + The PMU can only monitor the performance of traffic downstream of the target Root + Ports or of the target Endpoint. The PCIe PMU driver supports the "port" and + "bdf" interfaces for users. + Please note that one of these two interfaces must be set, and the two + interfaces cannot be used at the same time. If both are set, only the + "port" filter is valid. + If the "port" filter is not set or is explicitly set to zero (default), the + "bdf" filter will be in effect, because "bdf=0" means 0000:00:00.0. + + - port + + The "port" filter can be used in all PCIe PMU events; the target Root Port can be + selected by configuring the 16-bit bitmap "port". Multiple ports can be + selected for AP-layer-events, and only one port can be selected for + TL/DL-layer-events. + + For example, if the target Root Port is 0000:00:00.0 (x8 lanes), bit0 of the + bitmap should be set, port=0x1; if the target Root Port is 0000:00:04.0 (x4 + lanes), bit8 is set, port=0x100; if these two Root Ports are both + monitored, port=0x101. + + Example usage of perf:: + + $# perf stat -e hisi_pcie0_core0/rx_mwr_latency,port=0x1/ sleep 5 + + - bdf + + The "bdf" filter can only be used in bandwidth events; the target Endpoint is + selected by configuring BDF to "bdf". 
+ The counter only counts the bandwidth of + messages requested by the target Endpoint. + + For example, "bdf=0x3900" means the BDF of the target Endpoint is 0000:39:00.0. + + Example usage of perf:: + + $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,bdf=0x3900/ sleep 5 + +2. Trigger filter + + Event statistics start the first time the TLP length is greater/smaller + than the trigger condition. You can set the trigger condition by writing + "trig_len", and set the trigger mode by writing "trig_mode". This filter can + only be used in bandwidth events. + + For example, "trig_len=4" means the trigger condition is 2^4 DW, "trig_mode=0" + means statistics start when TLP length > trigger condition, "trig_mode=1" + means statistics start when TLP length < trigger condition. + + Example usage of perf:: + + $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,trig_len=0x4,trig_mode=1/ sleep 5 + +3. Threshold filter + + The counter counts when the TLP length is within the specified range. You can set the + threshold by writing "thr_len", and set the threshold mode by writing + "thr_mode". This filter can only be used in bandwidth events. + + For example, "thr_len=4" means the threshold is 2^4 DW, "thr_mode=0" means + the counter counts when TLP length >= threshold, and "thr_mode=1" means it counts + when TLP length < threshold. + + Example usage of perf:: + + $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,thr_len=0x4,thr_mode=1/ sleep 5 + +4. TLP Length filter + + When counting bandwidth, the data can be composed of certain parts of TLP + packets. You can specify it through "len_mode": + + - 2'b00: Reserved (Do not use this since the behaviour is undefined) + - 2'b01: Bandwidth of TLP payloads + - 2'b10: Bandwidth of TLP headers + - 2'b11: Bandwidth of both TLP payloads and headers + + For example, "len_mode=2" means only counting the bandwidth of TLP headers + and "len_mode=3" means the final bandwidth data is composed of both TLP + headers and payloads. The default value, if not specified, is 2'b11. 
+ + Example usage of perf:: + + $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,len_mode=0x1/ sleep 5 diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst index ee4bfd2a740f5da374d13a486b528f6058b011af..9d385055fd09900c3665614fa047bc9fd6572e28 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -8,6 +8,7 @@ Performance monitor support :maxdepth: 1 hisi-pmu + hisi-pcie-pmu qcom_l2_pmu qcom_l3_pmu arm-ccn diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 13f595e6fe180521dcfa5c56b49e4d23a8235f05..8a9a053889316416c13f5ec28ab2396bf35ce921 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -133,6 +133,10 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| Hisilicon | Hip{08,09,09A,10| #162001900 | N/A | +| | ,10C,11} | | | +| | SMMU PMCG | | | ++----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Qualcomm Tech. 
| Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/MAINTAINERS b/MAINTAINERS index dd2d56531d91cef4a88ec6a722174f556fcc477f..f70f52cc4b405863eafbb59267b42641ff355400 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7402,8 +7402,10 @@ F: Documentation/devicetree/bindings/net/hisilicon*.txt HISILICON PMU DRIVER M: Shaokun Zhang +M: Qi Liu W: http://www.hisilicon.com S: Supported +F: Documentation/admin-guide/perf/hisi-pcie-pmu.rst F: drivers/perf/hisilicon F: Documentation/admin-guide/perf/hisi-pmu.rst diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 19128d994ee978fd2653393c0c738a98b2ee67da..08c919cbfb6bb7803c1191c3fa4892c65a32bc01 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -344,8 +344,6 @@ static struct attribute_group armv8_pmuv3_format_attr_group = { */ #define ARMV8_IDX_CYCLE_COUNTER 0 #define ARMV8_IDX_COUNTER0 1 -#define ARMV8_IDX_COUNTER_LAST(cpu_pmu) \ - (ARMV8_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1) /* * We must chain two programmable counters for 64 bit events, @@ -389,12 +387,6 @@ static inline int armv8pmu_has_overflowed(u32 pmovsr) return pmovsr & ARMV8_PMU_OVERFLOWED_MASK; } -static inline int armv8pmu_counter_valid(struct arm_pmu *cpu_pmu, int idx) -{ - return idx >= ARMV8_IDX_CYCLE_COUNTER && - idx <= ARMV8_IDX_COUNTER_LAST(cpu_pmu); -} - static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx) { return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx)); @@ -426,15 +418,11 @@ static inline u64 armv8pmu_read_hw_counter(struct perf_event *event) static u64 armv8pmu_read_counter(struct perf_event *event) { - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; u64 value = 0; - if (!armv8pmu_counter_valid(cpu_pmu, idx)) - pr_err("CPU%u reading wrong counter %d\n", - smp_processor_id(), idx); - else if (idx == 
ARMV8_IDX_CYCLE_COUNTER) + if (idx == ARMV8_IDX_CYCLE_COUNTER) value = read_sysreg(pmccntr_el0); else value = armv8pmu_read_hw_counter(event); @@ -463,14 +451,10 @@ static inline void armv8pmu_write_hw_counter(struct perf_event *event, static void armv8pmu_write_counter(struct perf_event *event, u64 value) { - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; - if (!armv8pmu_counter_valid(cpu_pmu, idx)) - pr_err("CPU%u writing wrong counter %d\n", - smp_processor_id(), idx); - else if (idx == ARMV8_IDX_CYCLE_COUNTER) { + if (idx == ARMV8_IDX_CYCLE_COUNTER) { /* * The cycles counter is really a 64-bit counter. * When treating it as a 32-bit counter, we only count diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index bc95a5eebd137c86ead416813434f9ecd906fe72..425b2821aee0d49e70ac6c1e157531b6278c5be3 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1385,7 +1385,19 @@ static void __init arm_smmu_v3_pmcg_init_resources(struct resource *res, static struct acpi_platform_list pmcg_plat_info[] __initdata = { /* HiSilicon Hip08 Platform */ {"HISI ", "HIP08 ", 0, ACPI_SIG_IORT, greater_than_or_equal, - "Erratum #162001800", IORT_SMMU_V3_PMCG_HISI_HIP08}, + "Erratum #162001800, Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP08}, + /* HiSilicon Hip09 Platform */ + {"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP09A ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + /* HiSilicon Hip10/11 Platform uses the same SMMU IP with Hip09 */ + {"HISI ", "HIP10 ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP10C ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, + {"HISI ", "HIP11 ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum 
#162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, { } }; diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 09ae8a970880fde4fbba33da9a67de541353fb17..a9261cf48293b90ad5b423477d07db1e5e6835df 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -79,13 +79,6 @@ config FSL_IMX8_DDR_PMU can give information about memory throughput and other related events. -config HISI_PMU - bool "HiSilicon SoC PMU" - depends on ARM64 && ACPI - help - Support for HiSilicon SoC uncore performance monitoring - unit (PMU), such as: L3C, HHA and DDRC. - config QCOM_L2_PMU bool "Qualcomm Technologies L2-cache PMU" depends on ARCH_QCOM && ARM64 && ACPI @@ -129,4 +122,6 @@ config ARM_SPE_PMU Extension, which provides periodic sampling of operations in the CPU pipeline and reports this via the perf AUX interface. +source "drivers/perf/hisilicon/Kconfig" + endmenu diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c index 9cdd89b29334e61900ec88aec2dd88f0d0c846ca..bba35145d1430275d39e4d1e6f413521e7a1b65e 100644 --- a/drivers/perf/arm_smmuv3_pmu.c +++ b/drivers/perf/arm_smmuv3_pmu.c @@ -95,6 +95,7 @@ #define SMMU_PMCG_PA_SHIFT 12 #define SMMU_PMCG_EVCNTR_RDONLY BIT(0) +#define SMMU_PMCG_HARDEN_DISABLE BIT(1) static int cpuhp_state_num; @@ -138,6 +139,20 @@ static inline void smmu_pmu_enable(struct pmu *pmu) writel(SMMU_PMCG_CR_ENABLE, smmu_pmu->reg_base + SMMU_PMCG_CR); } +static int smmu_pmu_apply_event_filter(struct smmu_pmu *smmu_pmu, + struct perf_event *event, int idx); + +static inline void smmu_pmu_enable_quirk_hip08_09(struct pmu *pmu) +{ + struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu); + unsigned int idx; + + for_each_set_bit(idx, smmu_pmu->used_counters, smmu_pmu->num_counters) + smmu_pmu_apply_event_filter(smmu_pmu, smmu_pmu->events[idx], idx); + + smmu_pmu_enable(pmu); +} + static inline void smmu_pmu_disable(struct pmu *pmu) { struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu); @@ -146,6 +161,22 @@ static inline void smmu_pmu_disable(struct pmu *pmu) 
writel(0, smmu_pmu->reg_base + SMMU_PMCG_IRQ_CTRL); } +static inline void smmu_pmu_disable_quirk_hip08_09(struct pmu *pmu) +{ + struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu); + unsigned int idx; + + /* + * The global disable of the PMU sometimes fails to stop the counting. + * Harden this by writing an invalid event type to each used counter + * to forcibly stop counting. + */ + for_each_set_bit(idx, smmu_pmu->used_counters, smmu_pmu->num_counters) + writel(0xffff, smmu_pmu->reg_base + SMMU_PMCG_EVTYPER(idx)); + + smmu_pmu_disable(pmu); +} + static inline void smmu_pmu_counter_set_value(struct smmu_pmu *smmu_pmu, u32 idx, u64 value) { @@ -717,7 +748,10 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu *smmu_pmu) switch (model) { case IORT_SMMU_V3_PMCG_HISI_HIP08: /* HiSilicon Erratum 162001800 */ - smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY; + smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY | SMMU_PMCG_HARDEN_DISABLE; + break; + case IORT_SMMU_V3_PMCG_HISI_HIP09: + smmu_pmu->options |= SMMU_PMCG_HARDEN_DISABLE; break; } @@ -806,6 +840,16 @@ static int smmu_pmu_probe(struct platform_device *pdev) smmu_pmu_get_acpi_options(smmu_pmu); + /* + * For platforms that suffer from this quirk, the PMU disable sometimes fails to + * stop the counters. This leads to inaccurate or erroneous counts. + * Forcibly disable the counters with these quirk handlers. 
+ */ + if (smmu_pmu->options & SMMU_PMCG_HARDEN_DISABLE) { + smmu_pmu->pmu.pmu_enable = smmu_pmu_enable_quirk_hip08_09; + smmu_pmu->pmu.pmu_disable = smmu_pmu_disable_quirk_hip08_09; + } + /* Pick one CPU to be the preferred one to use */ smmu_pmu->on_cpu = raw_smp_processor_id(); WARN_ON(irq_set_affinity_hint(smmu_pmu->irq, @@ -889,6 +933,7 @@ static void __exit arm_smmu_pmu_exit(void) module_exit(arm_smmu_pmu_exit); +MODULE_ALIAS("platform:arm-smmu-v3-pmcg"); MODULE_DESCRIPTION("PMU driver for ARM SMMUv3 Performance Monitors Extension"); MODULE_AUTHOR("Neil Leeder "); MODULE_AUTHOR("Shameer Kolothum "); diff --git a/drivers/perf/hisilicon/Kconfig b/drivers/perf/hisilicon/Kconfig new file mode 100644 index 0000000000000000000000000000000000000000..5546218b5598db757df2ed02b0fadefce279b5b5 --- /dev/null +++ b/drivers/perf/hisilicon/Kconfig @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: GPL-2.0-only +config HISI_PMU + tristate "HiSilicon SoC PMU drivers" + depends on ARM64 && ACPI + help + Support for HiSilicon SoC L3 Cache performance monitor, Hydra Home + Agent performance monitor and DDR Controller performance monitor. + +config HISI_PCIE_PMU + tristate "HiSilicon PCIE PERF PMU" + depends on PCI && ARM64 + help + Provide support for HiSilicon PCIe performance monitoring unit (PMU) + RCiEP devices. + Adds the PCIe PMU into perf events system for monitoring latency, + bandwidth etc. 
diff --git a/drivers/perf/hisilicon/Makefile b/drivers/perf/hisilicon/Makefile index c3a96ec2bf66f9dc6ca888503f2b2cf6b7ec9654..63c558fcffb14714a466ba44fc5899fca16e6b1d 100644 --- a/drivers/perf/hisilicon/Makefile +++ b/drivers/perf/hisilicon/Makefile @@ -1,2 +1,5 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o hisi_uncore_ddrc_pmu.o +obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o \ + hisi_uncore_hha_pmu.o hisi_uncore_ddrc_pmu.o + +obj-$(CONFIG_HISI_PCIE_PMU) += hisi_pcie_pmu.o diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c new file mode 100644 index 0000000000000000000000000000000000000000..438780ec81411e7e0e74f2c38666c48dc0bfee3b --- /dev/null +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c @@ -0,0 +1,1000 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * This driver adds support for PCIe PMU RCiEP device. Related + * perf events are bandwidth, latency etc. 
+ * + * Copyright (C) 2021 HiSilicon Limited + * Author: Qi Liu + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define DRV_NAME "hisi_pcie_pmu" +/* Define registers */ +#define HISI_PCIE_GLOBAL_CTRL 0x00 +#define HISI_PCIE_EVENT_CTRL 0x010 +#define HISI_PCIE_CNT 0x090 +#define HISI_PCIE_EXT_CNT 0x110 +#define HISI_PCIE_INT_STAT 0x150 +#define HISI_PCIE_INT_MASK 0x154 +#define HISI_PCIE_REG_BDF 0xfe0 +#define HISI_PCIE_REG_VERSION 0xfe4 +#define HISI_PCIE_REG_INFO 0xfe8 + +/* Define command in HISI_PCIE_GLOBAL_CTRL */ +#define HISI_PCIE_GLOBAL_EN 0x01 +#define HISI_PCIE_GLOBAL_NONE 0 + +/* Define command in HISI_PCIE_EVENT_CTRL */ +#define HISI_PCIE_EVENT_EN BIT_ULL(20) +#define HISI_PCIE_RESET_CNT BIT_ULL(22) +#define HISI_PCIE_INIT_SET BIT_ULL(34) +#define HISI_PCIE_THR_EN BIT_ULL(26) +#define HISI_PCIE_TARGET_EN BIT_ULL(32) +#define HISI_PCIE_TRIG_EN BIT_ULL(52) + +/* Define offsets in HISI_PCIE_EVENT_CTRL */ +#define HISI_PCIE_EVENT_M GENMASK_ULL(15, 0) +#define HISI_PCIE_THR_MODE_M GENMASK_ULL(27, 27) +#define HISI_PCIE_THR_M GENMASK_ULL(31, 28) +#define HISI_PCIE_TARGET_M GENMASK_ULL(52, 36) +#define HISI_PCIE_TRIG_MODE_M GENMASK_ULL(53, 53) +#define HISI_PCIE_TRIG_M GENMASK_ULL(59, 56) + +#define HISI_PCIE_MAX_COUNTERS 8 +#define HISI_PCIE_REG_STEP 8 +#define HISI_PCIE_THR_MAX_VAL 10 +#define HISI_PCIE_TRIG_MAX_VAL 10 +#define HISI_PCIE_MAX_PERIOD (GENMASK_ULL(63, 0)) +#define HISI_PCIE_INIT_VAL BIT_ULL(63) + +struct hisi_pcie_pmu { + struct perf_event *hw_events[HISI_PCIE_MAX_COUNTERS]; + struct hlist_node node; + struct pci_dev *pdev; + struct pmu pmu; + void __iomem *base; + int irq; + u32 identifier; + /* Minimum and maximum BDF of root ports monitored by PMU */ + u16 bdf_min; + u16 bdf_max; + int on_cpu; +}; + +struct hisi_pcie_reg_pair { + u16 lo; + u16 hi; +}; + +#define to_pcie_pmu(p) (container_of((p), struct hisi_pcie_pmu, pmu)) +#define GET_PCI_DEVFN(bdf) ((bdf) & 
0xff) + +#define HISI_PCIE_PMU_FILTER_ATTR(_name, _config, _hi, _lo) \ + static u64 hisi_pcie_get_##_name(struct perf_event *event) \ + { \ + return FIELD_GET(GENMASK(_hi, _lo), event->attr._config); \ + } \ + +HISI_PCIE_PMU_FILTER_ATTR(event, config, 16, 0); +HISI_PCIE_PMU_FILTER_ATTR(thr_len, config1, 3, 0); +HISI_PCIE_PMU_FILTER_ATTR(thr_mode, config1, 4, 4); +HISI_PCIE_PMU_FILTER_ATTR(trig_len, config1, 8, 5); +HISI_PCIE_PMU_FILTER_ATTR(trig_mode, config1, 9, 9); +HISI_PCIE_PMU_FILTER_ATTR(port, config2, 15, 0); +HISI_PCIE_PMU_FILTER_ATTR(bdf, config2, 31, 16); + +static ssize_t hisi_pcie_format_sysfs_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct dev_ext_attribute *eattr; + + eattr = container_of(attr, struct dev_ext_attribute, attr); + + return sysfs_emit(buf, "%s\n", (char *)eattr->var); +} + +static ssize_t hisi_pcie_event_sysfs_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct perf_pmu_events_attr *pmu_attr = + container_of(attr, struct perf_pmu_events_attr, attr); + + return sysfs_emit(buf, "config=0x%llx\n", pmu_attr->id); +} + +#define HISI_PCIE_PMU_FORMAT_ATTR(_name, _format) \ + (&((struct dev_ext_attribute[]){ \ + { .attr = __ATTR(_name, 0444, hisi_pcie_format_sysfs_show, \ + NULL), \ + .var = (void *)_format } \ + })[0].attr.attr) + +#define HISI_PCIE_PMU_EVENT_ATTR(_name, _id) \ + (&((struct perf_pmu_events_attr[]) { \ + { .attr = __ATTR(_name, 0444, hisi_pcie_event_sysfs_show, NULL), \ + .id = _id, } \ + })[0].attr.attr) + +static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(pcie_pmu->on_cpu)); +} +static DEVICE_ATTR_RO(cpumask); + +static ssize_t identifier_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); + + return 
sysfs_emit(buf, "%#x\n", pcie_pmu->identifier); +} +static DEVICE_ATTR_RO(identifier); + +static ssize_t bus_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); + + return sysfs_emit(buf, "%#04x\n", PCI_BUS_NUM(pcie_pmu->bdf_min)); +} +static DEVICE_ATTR_RO(bus); + +static ssize_t bdf_min_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); + + return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_min); +} +static DEVICE_ATTR_RO(bdf_min); + +static ssize_t bdf_max_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); + + return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_max); +} +static DEVICE_ATTR_RO(bdf_max); + +static struct hisi_pcie_reg_pair +hisi_pcie_parse_reg_value(struct hisi_pcie_pmu *pcie_pmu, u32 reg_off) +{ + u32 val = readl_relaxed(pcie_pmu->base + reg_off); + struct hisi_pcie_reg_pair regs = { + .lo = val, + .hi = val >> 16, + }; + + return regs; +} + +/* + * Hardware counter and ext_counter work together for bandwidth, latency, bus + * utilization and buffer occupancy events. For example, RX memory write latency + * events(index = 0x0010), counter counts total delay cycles and ext_counter + * counts RX memory write PCIe packets number. + * + * As we don't want PMU driver to process these two data, "delay cycles" can + * be treated as an independent event(index = 0x0010), "RX memory write packets + * number" as another(index = 0x10010). BIT 16 is used to distinguish and 0-15 + * bits are "real" event index, which can be used to set HISI_PCIE_EVENT_CTRL. 
+ */ +#define EXT_COUNTER_IS_USED(idx) ((idx) & BIT(16)) + +static u32 hisi_pcie_get_real_event(struct perf_event *event) +{ + return hisi_pcie_get_event(event) & GENMASK(15, 0); +} + +static u32 hisi_pcie_pmu_get_offset(u32 offset, u32 idx) +{ + return offset + HISI_PCIE_REG_STEP * idx; +} + +static u32 hisi_pcie_pmu_readl(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset, + u32 idx) +{ + u32 offset = hisi_pcie_pmu_get_offset(reg_offset, idx); + + return readl_relaxed(pcie_pmu->base + offset); +} + +static void hisi_pcie_pmu_writel(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset, u32 idx, u32 val) +{ + u32 offset = hisi_pcie_pmu_get_offset(reg_offset, idx); + + writel_relaxed(val, pcie_pmu->base + offset); +} + +static u64 hisi_pcie_pmu_readq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset, u32 idx) +{ + u32 offset = hisi_pcie_pmu_get_offset(reg_offset, idx); + + return readq_relaxed(pcie_pmu->base + offset); +} + +static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset, u32 idx, u64 val) +{ + u32 offset = hisi_pcie_pmu_get_offset(reg_offset, idx); + + writeq_relaxed(val, pcie_pmu->base + offset); +} + +static u64 hisi_pcie_pmu_get_event_ctrl_val(struct perf_event *event) +{ + u64 reg = 0; + u64 port, trig_len, thr_len; + + /* Config HISI_PCIE_EVENT_CTRL according to event. */ + reg |= FIELD_PREP(HISI_PCIE_EVENT_M, hisi_pcie_get_real_event(event)); + + /* Config HISI_PCIE_EVENT_CTRL according to root port or EP device. */ + port = hisi_pcie_get_port(event); + if (port) + reg |= FIELD_PREP(HISI_PCIE_TARGET_M, port); + else + reg |= HISI_PCIE_TARGET_EN | + FIELD_PREP(HISI_PCIE_TARGET_M, hisi_pcie_get_bdf(event)); + + /* Config HISI_PCIE_EVENT_CTRL according to trigger condition. 
*/ + trig_len = hisi_pcie_get_trig_len(event); + if (trig_len) { + reg |= FIELD_PREP(HISI_PCIE_TRIG_M, trig_len); + reg |= FIELD_PREP(HISI_PCIE_TRIG_MODE_M, hisi_pcie_get_trig_mode(event)); + reg |= HISI_PCIE_TRIG_EN; + } + + /* Config HISI_PCIE_EVENT_CTRL according to threshold condition. */ + thr_len = hisi_pcie_get_thr_len(event); + if (thr_len) { + reg |= FIELD_PREP(HISI_PCIE_THR_M, thr_len); + reg |= FIELD_PREP(HISI_PCIE_THR_MODE_M, hisi_pcie_get_thr_mode(event)); + reg |= HISI_PCIE_THR_EN; + } + + return reg; +} + +static void hisi_pcie_pmu_config_event_ctrl(struct perf_event *event) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + u64 reg = hisi_pcie_pmu_get_event_ctrl_val(event); + + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, hwc->idx, reg); +} + +static void hisi_pcie_pmu_clear_event_ctrl(struct perf_event *event) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, hwc->idx, HISI_PCIE_INIT_SET); +} + +static bool hisi_pcie_pmu_valid_requester_id(struct hisi_pcie_pmu *pcie_pmu, u32 bdf) +{ + struct pci_dev *root_port, *pdev; + u16 rp_bdf; + + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pcie_pmu->pdev->bus), PCI_BUS_NUM(bdf), + GET_PCI_DEVFN(bdf)); + if (!pdev) + return false; + + root_port = pcie_find_root_port(pdev); + if (!root_port) { + pci_dev_put(pdev); + return false; + } + + pci_dev_put(pdev); + rp_bdf = pci_dev_id(root_port); + return rp_bdf >= pcie_pmu->bdf_min && rp_bdf <= pcie_pmu->bdf_max; +} + +static bool hisi_pcie_pmu_valid_filter(struct perf_event *event, + struct hisi_pcie_pmu *pcie_pmu) +{ + u32 requester_id = hisi_pcie_get_bdf(event); + + if (hisi_pcie_get_thr_len(event) > HISI_PCIE_THR_MAX_VAL) + return false; + + if (hisi_pcie_get_trig_len(event) > HISI_PCIE_TRIG_MAX_VAL) + return false; + + /* Need to explicitly set filter of "port" or "bdf" */ + if 
(!hisi_pcie_get_port(event) && + !hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id)) + return false; + + return true; +} + +/* + * Check whether two events share the same config. Sharing a config means that + * not only the event code, but also the filter settings of the two events are + * the same. + */ +static bool hisi_pcie_pmu_cmp_event(struct perf_event *target, + struct perf_event *event) +{ + return hisi_pcie_pmu_get_event_ctrl_val(target) == + hisi_pcie_pmu_get_event_ctrl_val(event); +} + +static bool hisi_pcie_pmu_validate_event_group(struct perf_event *event) +{ + struct perf_event *sibling, *leader = event->group_leader; + struct perf_event *event_group[HISI_PCIE_MAX_COUNTERS]; + int counters = 1; + int num; + + event_group[0] = leader; + if (!is_software_event(leader)) { + if (leader->pmu != event->pmu) + return false; + + if (leader != event && !hisi_pcie_pmu_cmp_event(leader, event)) + event_group[counters++] = event; + } + + for_each_sibling_event(sibling, event->group_leader) { + if (is_software_event(sibling)) + continue; + + if (sibling->pmu != event->pmu) + return false; + + for (num = 0; num < counters; num++) { + /* + * If we find a related event, then it's a valid group + * since we don't need to allocate a new counter for it. + */ + if (hisi_pcie_pmu_cmp_event(event_group[num], sibling)) + break; + } + + /* + * Otherwise it's a new event but if there's no available counter, + * fail the check since we cannot schedule all the events in + * the group simultaneously. 
+ */ + if (num == HISI_PCIE_MAX_COUNTERS) + return false; + + if (num == counters) + event_group[counters++] = sibling; + } + + return true; +} + +static int hisi_pcie_pmu_event_init(struct perf_event *event) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + /* Check the type first before going on, otherwise it's not our event */ + if (event->attr.type != event->pmu->type) + return -ENOENT; + + if (EXT_COUNTER_IS_USED(hisi_pcie_get_event(event))) + hwc->event_base = HISI_PCIE_EXT_CNT; + else + hwc->event_base = HISI_PCIE_CNT; + + /* Sampling is not supported. */ + if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK) + return -EOPNOTSUPP; + + if (!hisi_pcie_pmu_valid_filter(event, pcie_pmu)) + return -EINVAL; + + if (!hisi_pcie_pmu_validate_event_group(event)) + return -EINVAL; + + event->cpu = pcie_pmu->on_cpu; + + return 0; +} + +static u64 hisi_pcie_pmu_read_counter(struct perf_event *event) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + u32 idx = event->hw.idx; + + return hisi_pcie_pmu_readq(pcie_pmu, event->hw.event_base, idx); +} + +/* + * Check all work events, if a relevant event is found then we return it + * first, otherwise return the first idle counter (need to reset). 
+ */ +static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu, + struct perf_event *event) +{ + int first_idle = -EAGAIN; + struct perf_event *sibling; + int idx; + + for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) { + sibling = pcie_pmu->hw_events[idx]; + if (!sibling) { + if (first_idle == -EAGAIN) + first_idle = idx; + continue; + } + + /* Related events must be used in a group */ + if (hisi_pcie_pmu_cmp_event(sibling, event) && + sibling->group_leader == event->group_leader) + return idx; + } + + return first_idle; +} + +static void hisi_pcie_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + u64 new_cnt, prev_cnt, delta; + + do { + prev_cnt = local64_read(&hwc->prev_count); + new_cnt = hisi_pcie_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev_cnt, + new_cnt) != prev_cnt); + + delta = (new_cnt - prev_cnt) & HISI_PCIE_MAX_PERIOD; + local64_add(delta, &event->count); +} + +static void hisi_pcie_pmu_read(struct perf_event *event) +{ + hisi_pcie_pmu_event_update(event); +} + +static void hisi_pcie_pmu_set_period(struct perf_event *event) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + int idx = hwc->idx; + u64 orig_cnt, cnt; + + orig_cnt = hisi_pcie_pmu_read_counter(event); + + local64_set(&hwc->prev_count, HISI_PCIE_INIT_VAL); + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_CNT, idx, HISI_PCIE_INIT_VAL); + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EXT_CNT, idx, HISI_PCIE_INIT_VAL); + + /* + * The counter may be unwritable if the target event is unsupported. + * Check this by comparing the counts after setting the period. If + * the counts stay unchanged after setting the period then update + * the hwc->prev_count correctly. Otherwise the final counts users + * get may be totally wrong. 
+ */ + cnt = hisi_pcie_pmu_read_counter(event); + if (orig_cnt == cnt) + local64_set(&hwc->prev_count, cnt); +} + +static void hisi_pcie_pmu_enable_counter(struct hisi_pcie_pmu *pcie_pmu, struct hw_perf_event *hwc) +{ + u32 idx = hwc->idx; + u64 val; + + val = hisi_pcie_pmu_readq(pcie_pmu, HISI_PCIE_EVENT_CTRL, idx); + val |= HISI_PCIE_EVENT_EN; + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, idx, val); +} + +static void hisi_pcie_pmu_disable_counter(struct hisi_pcie_pmu *pcie_pmu, struct hw_perf_event *hwc) +{ + u32 idx = hwc->idx; + u64 val; + + val = hisi_pcie_pmu_readq(pcie_pmu, HISI_PCIE_EVENT_CTRL, idx); + val &= ~HISI_PCIE_EVENT_EN; + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, idx, val); +} + +static void hisi_pcie_pmu_enable_int(struct hisi_pcie_pmu *pcie_pmu, struct hw_perf_event *hwc) +{ + u32 idx = hwc->idx; + + hisi_pcie_pmu_writel(pcie_pmu, HISI_PCIE_INT_MASK, idx, 0); +} + +static void hisi_pcie_pmu_disable_int(struct hisi_pcie_pmu *pcie_pmu, struct hw_perf_event *hwc) +{ + u32 idx = hwc->idx; + + hisi_pcie_pmu_writel(pcie_pmu, HISI_PCIE_INT_MASK, idx, 1); +} + +static void hisi_pcie_pmu_reset_counter(struct hisi_pcie_pmu *pcie_pmu, int idx) +{ + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, idx, HISI_PCIE_RESET_CNT); + hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, idx, HISI_PCIE_INIT_SET); +} + +static void hisi_pcie_pmu_start(struct perf_event *event, int flags) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + int idx = hwc->idx; + u64 prev_cnt; + + if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED))) + return; + + WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE)); + hwc->state = 0; + + hisi_pcie_pmu_config_event_ctrl(event); + hisi_pcie_pmu_enable_counter(pcie_pmu, hwc); + hisi_pcie_pmu_enable_int(pcie_pmu, hwc); + hisi_pcie_pmu_set_period(event); + + if (flags & PERF_EF_RELOAD) { + prev_cnt = local64_read(&hwc->prev_count); + hisi_pcie_pmu_writeq(pcie_pmu, 
hwc->event_base, idx, prev_cnt); + } + + perf_event_update_userpage(event); +} + +static void hisi_pcie_pmu_stop(struct perf_event *event, int flags) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + hisi_pcie_pmu_event_update(event); + hisi_pcie_pmu_disable_int(pcie_pmu, hwc); + hisi_pcie_pmu_disable_counter(pcie_pmu, hwc); + hisi_pcie_pmu_clear_event_ctrl(event); + WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED); + hwc->state |= PERF_HES_STOPPED; + + if (hwc->state & PERF_HES_UPTODATE) + return; + + hwc->state |= PERF_HES_UPTODATE; +} + +static int hisi_pcie_pmu_add(struct perf_event *event, int flags) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + int idx; + + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; + + idx = hisi_pcie_pmu_get_event_idx(pcie_pmu, event); + if (idx < 0) + return idx; + + hwc->idx = idx; + + /* No enabled counter found with related event, reset it */ + if (!pcie_pmu->hw_events[idx]) { + hisi_pcie_pmu_reset_counter(pcie_pmu, idx); + pcie_pmu->hw_events[idx] = event; + } + + if (flags & PERF_EF_START) + hisi_pcie_pmu_start(event, PERF_EF_RELOAD); + + return 0; +} + +static void hisi_pcie_pmu_del(struct perf_event *event, int flags) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + + hisi_pcie_pmu_stop(event, PERF_EF_UPDATE); + pcie_pmu->hw_events[hwc->idx] = NULL; + perf_event_update_userpage(event); +} + +static void hisi_pcie_pmu_enable(struct pmu *pmu) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(pmu); + int num; + + for (num = 0; num < HISI_PCIE_MAX_COUNTERS; num++) { + if (pcie_pmu->hw_events[num]) + break; + } + + if (num == HISI_PCIE_MAX_COUNTERS) + return; + + writel(HISI_PCIE_GLOBAL_EN, pcie_pmu->base + HISI_PCIE_GLOBAL_CTRL); +} + +static void hisi_pcie_pmu_disable(struct pmu *pmu) +{ + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(pmu); + + 
writel(HISI_PCIE_GLOBAL_NONE, pcie_pmu->base + HISI_PCIE_GLOBAL_CTRL); +} + +static irqreturn_t hisi_pcie_pmu_irq(int irq, void *data) +{ + struct hisi_pcie_pmu *pcie_pmu = data; + irqreturn_t ret = IRQ_NONE; + struct perf_event *event; + u32 overflown; + int idx; + + for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) { + overflown = hisi_pcie_pmu_readl(pcie_pmu, HISI_PCIE_INT_STAT, idx); + if (!overflown) + continue; + + /* Clear status of interrupt. */ + hisi_pcie_pmu_writel(pcie_pmu, HISI_PCIE_INT_STAT, idx, 1); + event = pcie_pmu->hw_events[idx]; + if (!event) + continue; + + hisi_pcie_pmu_event_update(event); + hisi_pcie_pmu_set_period(event); + ret = IRQ_HANDLED; + } + + return ret; +} + +static int hisi_pcie_pmu_irq_register(struct pci_dev *pdev, struct hisi_pcie_pmu *pcie_pmu) +{ + int irq, ret; + + ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI); + if (ret < 0) { + pci_err(pdev, "Failed to enable MSI vectors: %d\n", ret); + return ret; + } + + irq = pci_irq_vector(pdev, 0); + ret = request_irq(irq, hisi_pcie_pmu_irq, IRQF_NOBALANCING | IRQF_NO_THREAD, DRV_NAME, + pcie_pmu); + if (ret) { + pci_err(pdev, "Failed to register IRQ: %d\n", ret); + pci_free_irq_vectors(pdev); + return ret; + } + + pcie_pmu->irq = irq; + + return 0; +} + +static void hisi_pcie_pmu_irq_unregister(struct pci_dev *pdev, struct hisi_pcie_pmu *pcie_pmu) +{ + free_irq(pcie_pmu->irq, pcie_pmu); + pci_free_irq_vectors(pdev); +} + +static int hisi_pcie_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) +{ + struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node); + + if (pcie_pmu->on_cpu == -1) { + pcie_pmu->on_cpu = cpu; + WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(cpu))); + } + + return 0; +} + +static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) +{ + struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node); + unsigned int target; + + /* Nothing to do if this CPU doesn't own the PMU */ 
+ if (pcie_pmu->on_cpu != cpu) + return 0; + + pcie_pmu->on_cpu = -1; + /* Choose a new CPU from all online cpus. */ + target = cpumask_any_but(cpu_online_mask, cpu); + if (target >= nr_cpu_ids) { + pci_err(pcie_pmu->pdev, "There is no CPU to set\n"); + return 0; + } + + perf_pmu_migrate_context(&pcie_pmu->pmu, cpu, target); + /* Use this CPU for event counting */ + pcie_pmu->on_cpu = target; + WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(target))); + + return 0; +} + +static struct attribute *hisi_pcie_pmu_events_attr[] = { + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_latency, 0x0010), + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_cnt, 0x10010), + HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_latency, 0x0210), + HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_cnt, 0x10210), + HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_latency, 0x0011), + HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_cnt, 0x10011), + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_flux, 0x0104), + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_time, 0x10104), + HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_flux, 0x0804), + HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_time, 0x10804), + HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_flux, 0x2004), + HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_time, 0x12004), + HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_flux, 0x0105), + HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_time, 0x10105), + HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_flux, 0x0405), + HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_time, 0x10405), + HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_flux, 0x1005), + HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_time, 0x11005), + NULL +}; + +static struct attribute_group hisi_pcie_pmu_events_group = { + .name = "events", + .attrs = hisi_pcie_pmu_events_attr, +}; + +static struct attribute *hisi_pcie_pmu_format_attr[] = { + HISI_PCIE_PMU_FORMAT_ATTR(event, "config:0-16"), + HISI_PCIE_PMU_FORMAT_ATTR(thr_len, "config1:0-3"), + HISI_PCIE_PMU_FORMAT_ATTR(thr_mode, "config1:4"), + HISI_PCIE_PMU_FORMAT_ATTR(trig_len, "config1:5-8"), + HISI_PCIE_PMU_FORMAT_ATTR(trig_mode, "config1:9"), + HISI_PCIE_PMU_FORMAT_ATTR(port, "config2:0-15"), + HISI_PCIE_PMU_FORMAT_ATTR(bdf, "config2:16-31"), + 
NULL +}; + +static const struct attribute_group hisi_pcie_pmu_format_group = { + .name = "format", + .attrs = hisi_pcie_pmu_format_attr, +}; + +static struct attribute *hisi_pcie_pmu_bus_attrs[] = { + &dev_attr_bus.attr, + &dev_attr_bdf_max.attr, + &dev_attr_bdf_min.attr, + NULL +}; + +static const struct attribute_group hisi_pcie_pmu_bus_attr_group = { + .attrs = hisi_pcie_pmu_bus_attrs, +}; + +static struct attribute *hisi_pcie_pmu_cpumask_attrs[] = { + &dev_attr_cpumask.attr, + NULL +}; + +static const struct attribute_group hisi_pcie_pmu_cpumask_attr_group = { + .attrs = hisi_pcie_pmu_cpumask_attrs, +}; + +static struct attribute *hisi_pcie_pmu_identifier_attrs[] = { + &dev_attr_identifier.attr, + NULL +}; + +static const struct attribute_group hisi_pcie_pmu_identifier_attr_group = { + .attrs = hisi_pcie_pmu_identifier_attrs, +}; + +static const struct attribute_group *hisi_pcie_pmu_attr_groups[] = { + &hisi_pcie_pmu_events_group, + &hisi_pcie_pmu_format_group, + &hisi_pcie_pmu_bus_attr_group, + &hisi_pcie_pmu_cpumask_attr_group, + &hisi_pcie_pmu_identifier_attr_group, + NULL +}; + +static int hisi_pcie_alloc_pmu(struct pci_dev *pdev, struct hisi_pcie_pmu *pcie_pmu) +{ + struct hisi_pcie_reg_pair regs; + u16 sicl_id, core_id; + char *name; + + regs = hisi_pcie_parse_reg_value(pcie_pmu, HISI_PCIE_REG_BDF); + pcie_pmu->bdf_min = regs.lo; + pcie_pmu->bdf_max = regs.hi; + + regs = hisi_pcie_parse_reg_value(pcie_pmu, HISI_PCIE_REG_INFO); + sicl_id = regs.hi; + core_id = regs.lo; + + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_pcie%u_core%u", sicl_id, core_id); + if (!name) + return -ENOMEM; + + pcie_pmu->pdev = pdev; + pcie_pmu->on_cpu = -1; + pcie_pmu->identifier = readl(pcie_pmu->base + HISI_PCIE_REG_VERSION); + pcie_pmu->pmu = (struct pmu) { + .name = name, + .module = THIS_MODULE, + .event_init = hisi_pcie_pmu_event_init, + .pmu_enable = hisi_pcie_pmu_enable, + .pmu_disable = hisi_pcie_pmu_disable, + .add = hisi_pcie_pmu_add, + .del = hisi_pcie_pmu_del, 
+ .start = hisi_pcie_pmu_start, + .stop = hisi_pcie_pmu_stop, + .read = hisi_pcie_pmu_read, + .task_ctx_nr = perf_invalid_context, + .attr_groups = hisi_pcie_pmu_attr_groups, + .capabilities = PERF_PMU_CAP_NO_EXCLUDE, + }; + + return 0; +} + +static int hisi_pcie_init_pmu(struct pci_dev *pdev, struct hisi_pcie_pmu *pcie_pmu) +{ + int ret; + + pcie_pmu->base = pci_ioremap_bar(pdev, 2); + if (!pcie_pmu->base) { + pci_err(pdev, "Ioremap failed for pcie_pmu resource\n"); + return -ENOMEM; + } + + ret = hisi_pcie_alloc_pmu(pdev, pcie_pmu); + if (ret) + goto err_iounmap; + + ret = hisi_pcie_pmu_irq_register(pdev, pcie_pmu); + if (ret) + goto err_iounmap; + + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE, &pcie_pmu->node); + if (ret) { + pci_err(pdev, "Failed to register hotplug: %d\n", ret); + goto err_irq_unregister; + } + + ret = perf_pmu_register(&pcie_pmu->pmu, pcie_pmu->pmu.name, -1); + if (ret) { + pci_err(pdev, "Failed to register PCIe PMU: %d\n", ret); + goto err_hotplug_unregister; + } + + return ret; + +err_hotplug_unregister: + cpuhp_state_remove_instance_nocalls( + CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE, &pcie_pmu->node); + +err_irq_unregister: + hisi_pcie_pmu_irq_unregister(pdev, pcie_pmu); + +err_iounmap: + iounmap(pcie_pmu->base); + + return ret; +} + +static void hisi_pcie_uninit_pmu(struct pci_dev *pdev) +{ + struct hisi_pcie_pmu *pcie_pmu = pci_get_drvdata(pdev); + + perf_pmu_unregister(&pcie_pmu->pmu); + cpuhp_state_remove_instance_nocalls( + CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE, &pcie_pmu->node); + hisi_pcie_pmu_irq_unregister(pdev, pcie_pmu); + iounmap(pcie_pmu->base); +} + +static int hisi_pcie_init_dev(struct pci_dev *pdev) +{ + int ret; + + ret = pcim_enable_device(pdev); + if (ret) { + pci_err(pdev, "Failed to enable PCI device: %d\n", ret); + return ret; + } + + ret = pcim_iomap_regions(pdev, BIT(2), DRV_NAME); + if (ret < 0) { + pci_err(pdev, "Failed to request PCI mem regions: %d\n", ret); + return ret; + } + + 
pci_set_master(pdev); + + return 0; +} + +static int hisi_pcie_pmu_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct hisi_pcie_pmu *pcie_pmu; + int ret; + + pcie_pmu = devm_kzalloc(&pdev->dev, sizeof(*pcie_pmu), GFP_KERNEL); + if (!pcie_pmu) + return -ENOMEM; + + ret = hisi_pcie_init_dev(pdev); + if (ret) + return ret; + + ret = hisi_pcie_init_pmu(pdev, pcie_pmu); + if (ret) + return ret; + + pci_set_drvdata(pdev, pcie_pmu); + + return ret; +} + +static void hisi_pcie_pmu_remove(struct pci_dev *pdev) +{ + hisi_pcie_uninit_pmu(pdev); + pci_set_drvdata(pdev, NULL); +} + +static const struct pci_device_id hisi_pcie_pmu_ids[] = { + { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, 0xa12d) }, + { 0, } +}; +MODULE_DEVICE_TABLE(pci, hisi_pcie_pmu_ids); + +static struct pci_driver hisi_pcie_pmu_driver = { + .name = DRV_NAME, + .id_table = hisi_pcie_pmu_ids, + .probe = hisi_pcie_pmu_probe, + .remove = hisi_pcie_pmu_remove, +}; + +static int __init hisi_pcie_module_init(void) +{ + int ret; + + ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE, + "AP_PERF_ARM_HISI_PCIE_PMU_ONLINE", + hisi_pcie_pmu_online_cpu, + hisi_pcie_pmu_offline_cpu); + if (ret) { + pr_err("Failed to setup PCIe PMU hotplug: %d\n", ret); + return ret; + } + + ret = pci_register_driver(&hisi_pcie_pmu_driver); + if (ret) + cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE); + + return ret; +} +module_init(hisi_pcie_module_init); + +static void __exit hisi_pcie_module_exit(void) +{ + pci_unregister_driver(&hisi_pcie_pmu_driver); + cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE); +} +module_exit(hisi_pcie_module_exit); + +MODULE_DESCRIPTION("HiSilicon PCIe PMU driver"); +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Qi Liu "); diff --git a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c index b79c96b14328bc02497f9dfefe7367004a6c6210..2dd726c16461288bddee845b6769f2c43684ac76 100644 --- 
a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c @@ -14,12 +14,11 @@ #include #include #include -#include #include #include "hisi_uncore_pmu.h" -/* DDRC register definition */ +/* DDRC register definition in v1 */ #define DDRC_PERF_CTRL 0x010 #define DDRC_FLUX_WR 0x380 #define DDRC_FLUX_RD 0x384 @@ -33,13 +32,26 @@ #define DDRC_INT_MASK 0x6c8 #define DDRC_INT_STATUS 0x6cc #define DDRC_INT_CLEAR 0x6d0 +#define DDRC_VERSION 0x710 + +/* DDRC register definition in v2 */ +#define DDRC_V2_INT_MASK 0x528 +#define DDRC_V2_INT_STATUS 0x52c +#define DDRC_V2_INT_CLEAR 0x530 +#define DDRC_V2_EVENT_CNT 0xe00 +#define DDRC_V2_EVENT_CTRL 0xe70 +#define DDRC_V2_EVENT_TYPE 0xe74 +#define DDRC_V2_PERF_CTRL 0xeA0 /* DDRC has 8-counters */ #define DDRC_NR_COUNTERS 0x8 -#define DDRC_PERF_CTRL_EN 0x2 +#define DDRC_V1_PERF_CTRL_EN 0x2 +#define DDRC_V2_PERF_CTRL_EN 0x1 +#define DDRC_V1_NR_EVENTS 0x7 +#define DDRC_V2_NR_EVENTS 0x90 /* - * For DDRC PMU, there are eight-events and every event has been mapped + * For PMU v1, there are eight-events and every event has been mapped * to fixed-purpose counters which register offset is not consistent. * Therefore there is no write event type and we assume that event * code (0 to 7) is equal to counter index in PMU driver. @@ -53,73 +65,85 @@ static const u32 ddrc_reg_off[] = { /* * Select the counter register offset using the counter index. - * In DDRC there are no programmable counter, the count - * is readed form the statistics counter register itself. + * In PMU v1, there is no programmable counter, the count + * is read from the statistics counter register itself.
*/ -static u32 hisi_ddrc_pmu_get_counter_offset(int cntr_idx) +static u32 hisi_ddrc_pmu_v1_get_counter_offset(int cntr_idx) { return ddrc_reg_off[cntr_idx]; } -static u64 hisi_ddrc_pmu_read_counter(struct hisi_pmu *ddrc_pmu, - struct hw_perf_event *hwc) +static u32 hisi_ddrc_pmu_v2_get_counter_offset(int cntr_idx) { - /* Use event code as counter index */ - u32 idx = GET_DDRC_EVENTID(hwc); - - if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) { - dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx); - return 0; - } + return DDRC_V2_EVENT_CNT + cntr_idx * 8; +} - return readl(ddrc_pmu->base + hisi_ddrc_pmu_get_counter_offset(idx)); +static u64 hisi_ddrc_pmu_v1_read_counter(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) +{ + return readl(ddrc_pmu->base + + hisi_ddrc_pmu_v1_get_counter_offset(hwc->idx)); } -static void hisi_ddrc_pmu_write_counter(struct hisi_pmu *ddrc_pmu, +static void hisi_ddrc_pmu_v1_write_counter(struct hisi_pmu *ddrc_pmu, struct hw_perf_event *hwc, u64 val) { - u32 idx = GET_DDRC_EVENTID(hwc); + writel((u32)val, + ddrc_pmu->base + hisi_ddrc_pmu_v1_get_counter_offset(hwc->idx)); +} - if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) { - dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx); - return; - } +static u64 hisi_ddrc_pmu_v2_read_counter(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) +{ + return readq(ddrc_pmu->base + + hisi_ddrc_pmu_v2_get_counter_offset(hwc->idx)); +} - writel((u32)val, - ddrc_pmu->base + hisi_ddrc_pmu_get_counter_offset(idx)); +static void hisi_ddrc_pmu_v2_write_counter(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc, u64 val) +{ + writeq(val, + ddrc_pmu->base + hisi_ddrc_pmu_v2_get_counter_offset(hwc->idx)); } /* - * For DDRC PMU, event has been mapped to fixed-purpose counter by hardware, - * so there is no need to write event type. 
+ * For DDRC PMU v1, event has been mapped to fixed-purpose counter by hardware, + * so there is no need to write event type, while it is programmable counter in + * PMU v2. */ static void hisi_ddrc_pmu_write_evtype(struct hisi_pmu *hha_pmu, int idx, u32 type) { + u32 offset; + + if (hha_pmu->identifier >= HISI_PMU_V2) { + offset = DDRC_V2_EVENT_TYPE + 4 * idx; + writel(type, hha_pmu->base + offset); + } } -static void hisi_ddrc_pmu_start_counters(struct hisi_pmu *ddrc_pmu) +static void hisi_ddrc_pmu_v1_start_counters(struct hisi_pmu *ddrc_pmu) { u32 val; /* Set perf_enable in DDRC_PERF_CTRL to start event counting */ val = readl(ddrc_pmu->base + DDRC_PERF_CTRL); - val |= DDRC_PERF_CTRL_EN; + val |= DDRC_V1_PERF_CTRL_EN; writel(val, ddrc_pmu->base + DDRC_PERF_CTRL); } -static void hisi_ddrc_pmu_stop_counters(struct hisi_pmu *ddrc_pmu) +static void hisi_ddrc_pmu_v1_stop_counters(struct hisi_pmu *ddrc_pmu) { u32 val; /* Clear perf_enable in DDRC_PERF_CTRL to stop event counting */ val = readl(ddrc_pmu->base + DDRC_PERF_CTRL); - val &= ~DDRC_PERF_CTRL_EN; + val &= ~DDRC_V1_PERF_CTRL_EN; writel(val, ddrc_pmu->base + DDRC_PERF_CTRL); } -static void hisi_ddrc_pmu_enable_counter(struct hisi_pmu *ddrc_pmu, - struct hw_perf_event *hwc) +static void hisi_ddrc_pmu_v1_enable_counter(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) { u32 val; @@ -129,8 +153,8 @@ static void hisi_ddrc_pmu_enable_counter(struct hisi_pmu *ddrc_pmu, writel(val, ddrc_pmu->base + DDRC_EVENT_CTRL); } -static void hisi_ddrc_pmu_disable_counter(struct hisi_pmu *ddrc_pmu, - struct hw_perf_event *hwc) +static void hisi_ddrc_pmu_v1_disable_counter(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) { u32 val; @@ -140,7 +164,7 @@ static void hisi_ddrc_pmu_disable_counter(struct hisi_pmu *ddrc_pmu, writel(val, ddrc_pmu->base + DDRC_EVENT_CTRL); } -static int hisi_ddrc_pmu_get_event_idx(struct perf_event *event) +static int hisi_ddrc_pmu_v1_get_event_idx(struct perf_event *event) { struct hisi_pmu 
*ddrc_pmu = to_hisi_pmu(event->pmu); unsigned long *used_mask = ddrc_pmu->pmu_events.used_mask; @@ -156,87 +180,117 @@ static int hisi_ddrc_pmu_get_event_idx(struct perf_event *event) return idx; } -static void hisi_ddrc_pmu_enable_counter_int(struct hisi_pmu *ddrc_pmu, +static int hisi_ddrc_pmu_v2_get_event_idx(struct perf_event *event) +{ + return hisi_uncore_pmu_get_event_idx(event); +} + +static void hisi_ddrc_pmu_v2_start_counters(struct hisi_pmu *ddrc_pmu) +{ + u32 val; + + val = readl(ddrc_pmu->base + DDRC_V2_PERF_CTRL); + val |= DDRC_V2_PERF_CTRL_EN; + writel(val, ddrc_pmu->base + DDRC_V2_PERF_CTRL); +} + +static void hisi_ddrc_pmu_v2_stop_counters(struct hisi_pmu *ddrc_pmu) +{ + u32 val; + + val = readl(ddrc_pmu->base + DDRC_V2_PERF_CTRL); + val &= ~DDRC_V2_PERF_CTRL_EN; + writel(val, ddrc_pmu->base + DDRC_V2_PERF_CTRL); +} + +static void hisi_ddrc_pmu_v2_enable_counter(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) +{ + u32 val; + + val = readl(ddrc_pmu->base + DDRC_V2_EVENT_CTRL); + val |= 1 << hwc->idx; + writel(val, ddrc_pmu->base + DDRC_V2_EVENT_CTRL); +} + +static void hisi_ddrc_pmu_v2_disable_counter(struct hisi_pmu *ddrc_pmu, struct hw_perf_event *hwc) { u32 val; + val = readl(ddrc_pmu->base + DDRC_V2_EVENT_CTRL); + val &= ~(1 << hwc->idx); + writel(val, ddrc_pmu->base + DDRC_V2_EVENT_CTRL); +} + +static void hisi_ddrc_pmu_v1_enable_counter_int(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) +{ + u32 val; + /* Write 0 to enable interrupt */ val = readl(ddrc_pmu->base + DDRC_INT_MASK); - val &= ~(1 << GET_DDRC_EVENTID(hwc)); + val &= ~(1 << hwc->idx); writel(val, ddrc_pmu->base + DDRC_INT_MASK); } -static void hisi_ddrc_pmu_disable_counter_int(struct hisi_pmu *ddrc_pmu, - struct hw_perf_event *hwc) +static void hisi_ddrc_pmu_v1_disable_counter_int(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) { u32 val; /* Write 1 to mask interrupt */ val = readl(ddrc_pmu->base + DDRC_INT_MASK); - val |= (1 << GET_DDRC_EVENTID(hwc)); + val 
|= 1 << hwc->idx; writel(val, ddrc_pmu->base + DDRC_INT_MASK); } -static irqreturn_t hisi_ddrc_pmu_isr(int irq, void *dev_id) +static void hisi_ddrc_pmu_v2_enable_counter_int(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) { - struct hisi_pmu *ddrc_pmu = dev_id; - struct perf_event *event; - unsigned long overflown; - int idx; - - /* Read the DDRC_INT_STATUS register */ - overflown = readl(ddrc_pmu->base + DDRC_INT_STATUS); - if (!overflown) - return IRQ_NONE; + u32 val; - /* - * Find the counter index which overflowed if the bit was set - * and handle it - */ - for_each_set_bit(idx, &overflown, DDRC_NR_COUNTERS) { - /* Write 1 to clear the IRQ status flag */ - writel((1 << idx), ddrc_pmu->base + DDRC_INT_CLEAR); + val = readl(ddrc_pmu->base + DDRC_V2_INT_MASK); + val &= ~(1 << hwc->idx); + writel(val, ddrc_pmu->base + DDRC_V2_INT_MASK); +} - /* Get the corresponding event struct */ - event = ddrc_pmu->pmu_events.hw_events[idx]; - if (!event) - continue; +static void hisi_ddrc_pmu_v2_disable_counter_int(struct hisi_pmu *ddrc_pmu, + struct hw_perf_event *hwc) +{ + u32 val; - hisi_uncore_pmu_event_update(event); - hisi_uncore_pmu_set_event_period(event); - } + val = readl(ddrc_pmu->base + DDRC_V2_INT_MASK); + val |= 1 << hwc->idx; + writel(val, ddrc_pmu->base + DDRC_V2_INT_MASK); +} - return IRQ_HANDLED; +static u32 hisi_ddrc_pmu_v1_get_int_status(struct hisi_pmu *ddrc_pmu) +{ + return readl(ddrc_pmu->base + DDRC_INT_STATUS); } -static int hisi_ddrc_pmu_init_irq(struct hisi_pmu *ddrc_pmu, - struct platform_device *pdev) +static void hisi_ddrc_pmu_v1_clear_int_status(struct hisi_pmu *ddrc_pmu, + int idx) { - int irq, ret; - - /* Read and init IRQ */ - irq = platform_get_irq(pdev, 0); - if (irq < 0) - return irq; - - ret = devm_request_irq(&pdev->dev, irq, hisi_ddrc_pmu_isr, - IRQF_NOBALANCING | IRQF_NO_THREAD, - dev_name(&pdev->dev), ddrc_pmu); - if (ret < 0) { - dev_err(&pdev->dev, - "Fail to request IRQ:%d ret:%d\n", irq, ret); - return ret; - } + writel(1 
<< idx, ddrc_pmu->base + DDRC_INT_CLEAR); +} - ddrc_pmu->irq = irq; +static u32 hisi_ddrc_pmu_v2_get_int_status(struct hisi_pmu *ddrc_pmu) +{ + return readl(ddrc_pmu->base + DDRC_V2_INT_STATUS); +} - return 0; +static void hisi_ddrc_pmu_v2_clear_int_status(struct hisi_pmu *ddrc_pmu, + int idx) +{ + writel(1 << idx, ddrc_pmu->base + DDRC_V2_INT_CLEAR); } static const struct acpi_device_id hisi_ddrc_pmu_acpi_match[] = { { "HISI0233", }, - {}, + { "HISI0234", }, + {} }; MODULE_DEVICE_TABLE(acpi, hisi_ddrc_pmu_acpi_match); @@ -270,20 +324,39 @@ static int hisi_ddrc_pmu_init_data(struct platform_device *pdev, return PTR_ERR(ddrc_pmu->base); } + ddrc_pmu->identifier = readl(ddrc_pmu->base + DDRC_VERSION); + if (ddrc_pmu->identifier >= HISI_PMU_V2) { + if (device_property_read_u32(&pdev->dev, "hisilicon,sub-id", + &ddrc_pmu->sub_id)) { + dev_err(&pdev->dev, "Can not read sub-id!\n"); + return -EINVAL; + } + } + return 0; } -static struct attribute *hisi_ddrc_pmu_format_attr[] = { +static struct attribute *hisi_ddrc_pmu_v1_format_attr[] = { HISI_PMU_FORMAT_ATTR(event, "config:0-4"), NULL, }; -static const struct attribute_group hisi_ddrc_pmu_format_group = { +static const struct attribute_group hisi_ddrc_pmu_v1_format_group = { + .name = "format", + .attrs = hisi_ddrc_pmu_v1_format_attr, +}; + +static struct attribute *hisi_ddrc_pmu_v2_format_attr[] = { + HISI_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL +}; + +static const struct attribute_group hisi_ddrc_pmu_v2_format_group = { .name = "format", - .attrs = hisi_ddrc_pmu_format_attr, + .attrs = hisi_ddrc_pmu_v2_format_attr, }; -static struct attribute *hisi_ddrc_pmu_events_attr[] = { +static struct attribute *hisi_ddrc_pmu_v1_events_attr[] = { HISI_PMU_EVENT_ATTR(flux_wr, 0x00), HISI_PMU_EVENT_ATTR(flux_rd, 0x01), HISI_PMU_EVENT_ATTR(flux_wcmd, 0x02), @@ -295,9 +368,21 @@ static struct attribute *hisi_ddrc_pmu_events_attr[] = { NULL, }; -static const struct attribute_group hisi_ddrc_pmu_events_group = { +static const 
struct attribute_group hisi_ddrc_pmu_v1_events_group = { .name = "events", - .attrs = hisi_ddrc_pmu_events_attr, + .attrs = hisi_ddrc_pmu_v1_events_attr, +}; + +static struct attribute *hisi_ddrc_pmu_v2_events_attr[] = { + HISI_PMU_EVENT_ATTR(cycles, 0x00), + HISI_PMU_EVENT_ATTR(flux_wr, 0x83), + HISI_PMU_EVENT_ATTR(flux_rd, 0x84), + NULL +}; + +static const struct attribute_group hisi_ddrc_pmu_v2_events_group = { + .name = "events", + .attrs = hisi_ddrc_pmu_v2_events_attr, }; static DEVICE_ATTR(cpumask, 0444, hisi_cpumask_sysfs_show, NULL); @@ -311,24 +396,62 @@ static const struct attribute_group hisi_ddrc_pmu_cpumask_attr_group = { .attrs = hisi_ddrc_pmu_cpumask_attrs, }; -static const struct attribute_group *hisi_ddrc_pmu_attr_groups[] = { - &hisi_ddrc_pmu_format_group, - &hisi_ddrc_pmu_events_group, +static struct device_attribute hisi_ddrc_pmu_identifier_attr = + __ATTR(identifier, 0444, hisi_uncore_pmu_identifier_attr_show, NULL); + +static struct attribute *hisi_ddrc_pmu_identifier_attrs[] = { + &hisi_ddrc_pmu_identifier_attr.attr, + NULL +}; + +static struct attribute_group hisi_ddrc_pmu_identifier_group = { + .attrs = hisi_ddrc_pmu_identifier_attrs, +}; + +static const struct attribute_group *hisi_ddrc_pmu_v1_attr_groups[] = { + &hisi_ddrc_pmu_v1_format_group, + &hisi_ddrc_pmu_v1_events_group, &hisi_ddrc_pmu_cpumask_attr_group, + &hisi_ddrc_pmu_identifier_group, NULL, }; -static const struct hisi_uncore_ops hisi_uncore_ddrc_ops = { +static const struct attribute_group *hisi_ddrc_pmu_v2_attr_groups[] = { + &hisi_ddrc_pmu_v2_format_group, + &hisi_ddrc_pmu_v2_events_group, + &hisi_ddrc_pmu_cpumask_attr_group, + &hisi_ddrc_pmu_identifier_group, + NULL +}; + +static const struct hisi_uncore_ops hisi_uncore_ddrc_v1_ops = { .write_evtype = hisi_ddrc_pmu_write_evtype, - .get_event_idx = hisi_ddrc_pmu_get_event_idx, - .start_counters = hisi_ddrc_pmu_start_counters, - .stop_counters = hisi_ddrc_pmu_stop_counters, - .enable_counter = hisi_ddrc_pmu_enable_counter, - 
.disable_counter = hisi_ddrc_pmu_disable_counter, - .enable_counter_int = hisi_ddrc_pmu_enable_counter_int, - .disable_counter_int = hisi_ddrc_pmu_disable_counter_int, - .write_counter = hisi_ddrc_pmu_write_counter, - .read_counter = hisi_ddrc_pmu_read_counter, + .get_event_idx = hisi_ddrc_pmu_v1_get_event_idx, + .start_counters = hisi_ddrc_pmu_v1_start_counters, + .stop_counters = hisi_ddrc_pmu_v1_stop_counters, + .enable_counter = hisi_ddrc_pmu_v1_enable_counter, + .disable_counter = hisi_ddrc_pmu_v1_disable_counter, + .enable_counter_int = hisi_ddrc_pmu_v1_enable_counter_int, + .disable_counter_int = hisi_ddrc_pmu_v1_disable_counter_int, + .write_counter = hisi_ddrc_pmu_v1_write_counter, + .read_counter = hisi_ddrc_pmu_v1_read_counter, + .get_int_status = hisi_ddrc_pmu_v1_get_int_status, + .clear_int_status = hisi_ddrc_pmu_v1_clear_int_status, +}; + +static const struct hisi_uncore_ops hisi_uncore_ddrc_v2_ops = { + .write_evtype = hisi_ddrc_pmu_write_evtype, + .get_event_idx = hisi_ddrc_pmu_v2_get_event_idx, + .start_counters = hisi_ddrc_pmu_v2_start_counters, + .stop_counters = hisi_ddrc_pmu_v2_stop_counters, + .enable_counter = hisi_ddrc_pmu_v2_enable_counter, + .disable_counter = hisi_ddrc_pmu_v2_disable_counter, + .enable_counter_int = hisi_ddrc_pmu_v2_enable_counter_int, + .disable_counter_int = hisi_ddrc_pmu_v2_disable_counter_int, + .write_counter = hisi_ddrc_pmu_v2_write_counter, + .read_counter = hisi_ddrc_pmu_v2_read_counter, + .get_int_status = hisi_ddrc_pmu_v2_get_int_status, + .clear_int_status = hisi_ddrc_pmu_v2_clear_int_status, }; static int hisi_ddrc_pmu_dev_probe(struct platform_device *pdev, @@ -340,16 +463,25 @@ static int hisi_ddrc_pmu_dev_probe(struct platform_device *pdev, if (ret) return ret; - ret = hisi_ddrc_pmu_init_irq(ddrc_pmu, pdev); + ret = hisi_uncore_pmu_init_irq(ddrc_pmu, pdev); if (ret) return ret; + if (ddrc_pmu->identifier >= HISI_PMU_V2) { + ddrc_pmu->counter_bits = 48; + ddrc_pmu->check_event = DDRC_V2_NR_EVENTS; + 
ddrc_pmu->pmu_events.attr_groups = hisi_ddrc_pmu_v2_attr_groups; + ddrc_pmu->ops = &hisi_uncore_ddrc_v2_ops; + } else { + ddrc_pmu->counter_bits = 32; + ddrc_pmu->check_event = DDRC_V1_NR_EVENTS; + ddrc_pmu->pmu_events.attr_groups = hisi_ddrc_pmu_v1_attr_groups; + ddrc_pmu->ops = &hisi_uncore_ddrc_v1_ops; + } + ddrc_pmu->num_counters = DDRC_NR_COUNTERS; - ddrc_pmu->counter_bits = 32; - ddrc_pmu->ops = &hisi_uncore_ddrc_ops; ddrc_pmu->dev = &pdev->dev; ddrc_pmu->on_cpu = -1; - ddrc_pmu->check_event = 7; return 0; } @@ -370,15 +502,17 @@ static int hisi_ddrc_pmu_probe(struct platform_device *pdev) if (ret) return ret; - ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE, - &ddrc_pmu->node); - if (ret) { - dev_err(&pdev->dev, "Error %d registering hotplug;\n", ret); - return ret; - } - name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_ddrc%u", - ddrc_pmu->sccl_id, ddrc_pmu->index_id); + if (ddrc_pmu->identifier >= HISI_PMU_V2) + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, + "hisi_sccl%u_ddrc%u_%u", + ddrc_pmu->sccl_id, ddrc_pmu->index_id, + ddrc_pmu->sub_id); + else + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, + "hisi_sccl%u_ddrc%u", ddrc_pmu->sccl_id, + ddrc_pmu->index_id); + ddrc_pmu->pmu = (struct pmu) { .name = name, .module = THIS_MODULE, @@ -391,15 +525,26 @@ static int hisi_ddrc_pmu_probe(struct platform_device *pdev) .start = hisi_uncore_pmu_start, .stop = hisi_uncore_pmu_stop, .read = hisi_uncore_pmu_read, - .attr_groups = hisi_ddrc_pmu_attr_groups, + .attr_groups = ddrc_pmu->pmu_events.attr_groups, .capabilities = PERF_PMU_CAP_NO_EXCLUDE, }; + if (!name) + return -ENOMEM; + + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE, + &ddrc_pmu->node); + if (ret) { + dev_err(&pdev->dev, "Error %d registering hotplug;\n", ret); + return ret; + } + ret = perf_pmu_register(&ddrc_pmu->pmu, name, -1); if (ret) { dev_err(ddrc_pmu->dev, "DDRC PMU register failed!\n"); - 
cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE, - &ddrc_pmu->node); + cpuhp_state_remove_instance_nocalls( + CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE, &ddrc_pmu->node); + irq_set_affinity_hint(ddrc_pmu->irq, NULL); } return ret; @@ -410,8 +555,9 @@ static int hisi_ddrc_pmu_remove(struct platform_device *pdev) struct hisi_pmu *ddrc_pmu = platform_get_drvdata(pdev); perf_pmu_unregister(&ddrc_pmu->pmu); - cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE, - &ddrc_pmu->node); + cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE, + &ddrc_pmu->node); + irq_set_affinity_hint(ddrc_pmu->irq, NULL); return 0; } diff --git a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c index 78865b4ac4a6f25a753700eb5513df81b8eebeee..9ae049270fe56681c0da7229f0c28c80808d4c51 100644 --- a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c @@ -14,7 +14,6 @@ #include #include #include -#include #include #include "hisi_uncore_pmu.h" @@ -23,6 +22,7 @@ #define HHA_INT_MASK 0x0804 #define HHA_INT_STATUS 0x0808 #define HHA_INT_CLEAR 0x080C +#define HHA_VERSION 0x1cf0 #define HHA_PERF_CTRL 0x1E00 #define HHA_EVENT_CTRL 0x1E04 #define HHA_EVENT_TYPE0 0x1E80 @@ -33,10 +33,11 @@ #define HHA_CNT0_LOWER 0x1F00 /* HHA has 16-counters */ -#define HHA_NR_COUNTERS 0x10 +#define HHA_V1_NR_COUNTERS 0x10 #define HHA_PERF_CTRL_EN 0x1 #define HHA_EVTYPE_NONE 0xff +#define HHA_V1_NR_EVENT 0x65 /* * Select the counter register offset using the counter index @@ -50,29 +51,15 @@ static u32 hisi_hha_pmu_get_counter_offset(int cntr_idx) static u64 hisi_hha_pmu_read_counter(struct hisi_pmu *hha_pmu, struct hw_perf_event *hwc) { - u32 idx = hwc->idx; - - if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) { - dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx); - return 0; - } - /* Read 64 bits and like L3C, top 16 bits are RAZ */ - return readq(hha_pmu->base + 
hisi_hha_pmu_get_counter_offset(idx)); + return readq(hha_pmu->base + hisi_hha_pmu_get_counter_offset(hwc->idx)); } static void hisi_hha_pmu_write_counter(struct hisi_pmu *hha_pmu, struct hw_perf_event *hwc, u64 val) { - u32 idx = hwc->idx; - - if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) { - dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx); - return; - } - /* Write 64 bits and like L3C, top 16 bits are WI */ - writeq(val, hha_pmu->base + hisi_hha_pmu_get_counter_offset(idx)); + writeq(val, hha_pmu->base + hisi_hha_pmu_get_counter_offset(hwc->idx)); } static void hisi_hha_pmu_write_evtype(struct hisi_pmu *hha_pmu, int idx, @@ -168,60 +155,14 @@ static void hisi_hha_pmu_disable_counter_int(struct hisi_pmu *hha_pmu, writel(val, hha_pmu->base + HHA_INT_MASK); } -static irqreturn_t hisi_hha_pmu_isr(int irq, void *dev_id) +static u32 hisi_hha_pmu_get_int_status(struct hisi_pmu *hha_pmu) { - struct hisi_pmu *hha_pmu = dev_id; - struct perf_event *event; - unsigned long overflown; - int idx; - - /* Read HHA_INT_STATUS register */ - overflown = readl(hha_pmu->base + HHA_INT_STATUS); - if (!overflown) - return IRQ_NONE; - - /* - * Find the counter index which overflowed if the bit was set - * and handle it - */ - for_each_set_bit(idx, &overflown, HHA_NR_COUNTERS) { - /* Write 1 to clear the IRQ status flag */ - writel((1 << idx), hha_pmu->base + HHA_INT_CLEAR); - - /* Get the corresponding event struct */ - event = hha_pmu->pmu_events.hw_events[idx]; - if (!event) - continue; - - hisi_uncore_pmu_event_update(event); - hisi_uncore_pmu_set_event_period(event); - } - - return IRQ_HANDLED; + return readl(hha_pmu->base + HHA_INT_STATUS); } -static int hisi_hha_pmu_init_irq(struct hisi_pmu *hha_pmu, - struct platform_device *pdev) +static void hisi_hha_pmu_clear_int_status(struct hisi_pmu *hha_pmu, int idx) { - int irq, ret; - - /* Read and init IRQ */ - irq = platform_get_irq(pdev, 0); - if (irq < 0) - return irq; - - ret = devm_request_irq(&pdev->dev, irq, 
hisi_hha_pmu_isr, - IRQF_NOBALANCING | IRQF_NO_THREAD, - dev_name(&pdev->dev), hha_pmu); - if (ret < 0) { - dev_err(&pdev->dev, - "Fail to request IRQ:%d ret:%d\n", irq, ret); - return ret; - } - - hha_pmu->irq = irq; - - return 0; + writel(1 << idx, hha_pmu->base + HHA_INT_CLEAR); } static const struct acpi_device_id hisi_hha_pmu_acpi_match[] = { @@ -263,20 +204,22 @@ static int hisi_hha_pmu_init_data(struct platform_device *pdev, return PTR_ERR(hha_pmu->base); } + hha_pmu->identifier = readl(hha_pmu->base + HHA_VERSION); + return 0; } -static struct attribute *hisi_hha_pmu_format_attr[] = { +static struct attribute *hisi_hha_pmu_v1_format_attr[] = { HISI_PMU_FORMAT_ATTR(event, "config:0-7"), NULL, }; -static const struct attribute_group hisi_hha_pmu_format_group = { +static const struct attribute_group hisi_hha_pmu_v1_format_group = { .name = "format", - .attrs = hisi_hha_pmu_format_attr, + .attrs = hisi_hha_pmu_v1_format_attr, }; -static struct attribute *hisi_hha_pmu_events_attr[] = { +static struct attribute *hisi_hha_pmu_v1_events_attr[] = { HISI_PMU_EVENT_ATTR(rx_ops_num, 0x00), HISI_PMU_EVENT_ATTR(rx_outer, 0x01), HISI_PMU_EVENT_ATTR(rx_sccl, 0x02), @@ -306,9 +249,9 @@ static struct attribute *hisi_hha_pmu_events_attr[] = { NULL, }; -static const struct attribute_group hisi_hha_pmu_events_group = { +static const struct attribute_group hisi_hha_pmu_v1_events_group = { .name = "events", - .attrs = hisi_hha_pmu_events_attr, + .attrs = hisi_hha_pmu_v1_events_attr, }; static DEVICE_ATTR(cpumask, 0444, hisi_cpumask_sysfs_show, NULL); @@ -322,10 +265,23 @@ static const struct attribute_group hisi_hha_pmu_cpumask_attr_group = { .attrs = hisi_hha_pmu_cpumask_attrs, }; -static const struct attribute_group *hisi_hha_pmu_attr_groups[] = { - &hisi_hha_pmu_format_group, - &hisi_hha_pmu_events_group, +static struct device_attribute hisi_hha_pmu_identifier_attr = + __ATTR(identifier, 0444, hisi_uncore_pmu_identifier_attr_show, NULL); + +static struct attribute 
*hisi_hha_pmu_identifier_attrs[] = { + &hisi_hha_pmu_identifier_attr.attr, + NULL +}; + +static struct attribute_group hisi_hha_pmu_identifier_group = { + .attrs = hisi_hha_pmu_identifier_attrs, +}; + +static const struct attribute_group *hisi_hha_pmu_v1_attr_groups[] = { + &hisi_hha_pmu_v1_format_group, + &hisi_hha_pmu_v1_events_group, &hisi_hha_pmu_cpumask_attr_group, + &hisi_hha_pmu_identifier_group, NULL, }; @@ -340,6 +296,8 @@ static const struct hisi_uncore_ops hisi_uncore_hha_ops = { .disable_counter_int = hisi_hha_pmu_disable_counter_int, .write_counter = hisi_hha_pmu_write_counter, .read_counter = hisi_hha_pmu_read_counter, + .get_int_status = hisi_hha_pmu_get_int_status, + .clear_int_status = hisi_hha_pmu_clear_int_status, }; static int hisi_hha_pmu_dev_probe(struct platform_device *pdev, @@ -351,16 +309,16 @@ static int hisi_hha_pmu_dev_probe(struct platform_device *pdev, if (ret) return ret; - ret = hisi_hha_pmu_init_irq(hha_pmu, pdev); + ret = hisi_uncore_pmu_init_irq(hha_pmu, pdev); if (ret) return ret; - hha_pmu->num_counters = HHA_NR_COUNTERS; + hha_pmu->num_counters = HHA_V1_NR_COUNTERS; hha_pmu->counter_bits = 48; hha_pmu->ops = &hisi_uncore_hha_ops; hha_pmu->dev = &pdev->dev; hha_pmu->on_cpu = -1; - hha_pmu->check_event = 0x65; + hha_pmu->check_event = HHA_V1_NR_EVENT; return 0; } @@ -381,6 +339,11 @@ static int hisi_hha_pmu_probe(struct platform_device *pdev) if (ret) return ret; + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_hha%u", + hha_pmu->sccl_id, hha_pmu->index_id); + if (!name) + return -ENOMEM; + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE, &hha_pmu->node); if (ret) { @@ -388,8 +351,6 @@ static int hisi_hha_pmu_probe(struct platform_device *pdev) return ret; } - name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_hha%u", - hha_pmu->sccl_id, hha_pmu->index_id); hha_pmu->pmu = (struct pmu) { .name = name, .module = THIS_MODULE, @@ -402,15 +363,16 @@ static int hisi_hha_pmu_probe(struct 
platform_device *pdev) .start = hisi_uncore_pmu_start, .stop = hisi_uncore_pmu_stop, .read = hisi_uncore_pmu_read, - .attr_groups = hisi_hha_pmu_attr_groups, + .attr_groups = hisi_hha_pmu_v1_attr_groups, .capabilities = PERF_PMU_CAP_NO_EXCLUDE, }; ret = perf_pmu_register(&hha_pmu->pmu, name, -1); if (ret) { dev_err(hha_pmu->dev, "HHA PMU register failed!\n"); - cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE, - &hha_pmu->node); + cpuhp_state_remove_instance_nocalls( + CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE, &hha_pmu->node); + irq_set_affinity_hint(hha_pmu->irq, NULL); } return ret; @@ -421,8 +383,9 @@ static int hisi_hha_pmu_remove(struct platform_device *pdev) struct hisi_pmu *hha_pmu = platform_get_drvdata(pdev); perf_pmu_unregister(&hha_pmu->pmu); - cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE, - &hha_pmu->node); + cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE, + &hha_pmu->node); + irq_set_affinity_hint(hha_pmu->irq, NULL); return 0; } diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c index 9dd50c3bc74ecb85fee2aec8a54df584e45b2235..66227265de704745d340ba42feb4b6fe4df81fa9 100644 --- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c @@ -14,7 +14,6 @@ #include #include #include -#include #include #include "hisi_uncore_pmu.h" @@ -24,11 +23,17 @@ #define L3C_INT_MASK 0x0800 #define L3C_INT_STATUS 0x0808 #define L3C_INT_CLEAR 0x080c +#define L3C_CORE_CTRL 0x1b04 +#define L3C_TRACETAG_CTRL 0x1b20 +#define L3C_DATSRC_TYPE 0x1b48 +#define L3C_DATSRC_CTRL 0x1bf0 #define L3C_EVENT_CTRL 0x1c00 +#define L3C_VERSION 0x1cf0 #define L3C_EVENT_TYPE0 0x1d00 /* - * Each counter is 48-bits and [48:63] are reserved - * which are Read-As-Zero and Writes-Ignored. + * If the HW version only supports a 48-bit counter, then + * bits [63:48] are reserved, which are Read-As-Zero and + * Writes-Ignored. 
*/ #define L3C_CNTR0_LOWER 0x1e00 @@ -36,7 +41,186 @@ #define L3C_NR_COUNTERS 0x8 #define L3C_PERF_CTRL_EN 0x10000 +#define L3C_TRACETAG_EN BIT(31) +#define L3C_TRACETAG_REQ_SHIFT 7 +#define L3C_TRACETAG_MARK_EN BIT(0) +#define L3C_TRACETAG_REQ_EN (L3C_TRACETAG_MARK_EN | BIT(2)) +#define L3C_TRACETAG_CORE_EN (L3C_TRACETAG_MARK_EN | BIT(3)) +#define L3C_CORE_EN BIT(20) +#define L3C_COER_NONE 0x0 +#define L3C_DATSRC_MASK 0xFF +#define L3C_DATSRC_SKT_EN BIT(23) +#define L3C_DATSRC_NONE 0x0 #define L3C_EVTYPE_NONE 0xff +#define L3C_V1_NR_EVENTS 0x59 +#define L3C_V2_NR_EVENTS 0xFF + +HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config1, 7, 0); +HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8); +HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11); +HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16); + +static void hisi_l3c_pmu_config_req_tracetag(struct perf_event *event) +{ + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); + u32 tt_req = hisi_get_tt_req(event); + + if (tt_req) { + u32 val; + + /* Set request-type for tracetag */ + val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); + val |= tt_req << L3C_TRACETAG_REQ_SHIFT; + val |= L3C_TRACETAG_REQ_EN; + writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); + + /* Enable request-tracetag statistics */ + val = readl(l3c_pmu->base + L3C_PERF_CTRL); + val |= L3C_TRACETAG_EN; + writel(val, l3c_pmu->base + L3C_PERF_CTRL); + } +} + +static void hisi_l3c_pmu_clear_req_tracetag(struct perf_event *event) +{ + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); + u32 tt_req = hisi_get_tt_req(event); + + if (tt_req) { + u32 val; + + /* Clear request-type */ + val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); + val &= ~(tt_req << L3C_TRACETAG_REQ_SHIFT); + val &= ~L3C_TRACETAG_REQ_EN; + writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); + + /* Disable request-tracetag statistics */ + val = readl(l3c_pmu->base + L3C_PERF_CTRL); + val &= ~L3C_TRACETAG_EN; + writel(val, l3c_pmu->base + L3C_PERF_CTRL); + } +} + +static void 
hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg) +{ + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + u32 reg, reg_idx, shift, val; + int idx = hwc->idx; + + /* + * Select the appropriate datasource register(L3C_DATSRC_TYPE0/1). + * There are 2 datasource ctrl register for the 8 hardware counters. + * Datasrc is 8-bits and for the former 4 hardware counters, + * L3C_DATSRC_TYPE0 is chosen. For the latter 4 hardware counters, + * L3C_DATSRC_TYPE1 is chosen. + */ + reg = L3C_DATSRC_TYPE + (idx / 4) * 4; + reg_idx = idx % 4; + shift = 8 * reg_idx; + + val = readl(l3c_pmu->base + reg); + val &= ~(L3C_DATSRC_MASK << shift); + val |= ds_cfg << shift; + writel(val, l3c_pmu->base + reg); +} + +static void hisi_l3c_pmu_config_ds(struct perf_event *event) +{ + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); + u32 ds_cfg = hisi_get_datasrc_cfg(event); + u32 ds_skt = hisi_get_datasrc_skt(event); + + if (ds_cfg) + hisi_l3c_pmu_write_ds(event, ds_cfg); + + if (ds_skt) { + u32 val; + + val = readl(l3c_pmu->base + L3C_DATSRC_CTRL); + val |= L3C_DATSRC_SKT_EN; + writel(val, l3c_pmu->base + L3C_DATSRC_CTRL); + } +} + +static void hisi_l3c_pmu_clear_ds(struct perf_event *event) +{ + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); + u32 ds_cfg = hisi_get_datasrc_cfg(event); + u32 ds_skt = hisi_get_datasrc_skt(event); + + if (ds_cfg) + hisi_l3c_pmu_write_ds(event, L3C_DATSRC_NONE); + + if (ds_skt) { + u32 val; + + val = readl(l3c_pmu->base + L3C_DATSRC_CTRL); + val &= ~L3C_DATSRC_SKT_EN; + writel(val, l3c_pmu->base + L3C_DATSRC_CTRL); + } +} + +static void hisi_l3c_pmu_config_core_tracetag(struct perf_event *event) +{ + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); + u32 core = hisi_get_tt_core(event); + + if (core) { + u32 val; + + /* Config and enable core information */ + writel(core, l3c_pmu->base + L3C_CORE_CTRL); + val = readl(l3c_pmu->base + L3C_PERF_CTRL); + val |= L3C_CORE_EN; + writel(val, l3c_pmu->base 
+ L3C_PERF_CTRL); + + /* Enable core-tracetag statistics */ + val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); + val |= L3C_TRACETAG_CORE_EN; + writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); + } +} + +static void hisi_l3c_pmu_clear_core_tracetag(struct perf_event *event) +{ + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); + u32 core = hisi_get_tt_core(event); + + if (core) { + u32 val; + + /* Clear core information */ + writel(L3C_COER_NONE, l3c_pmu->base + L3C_CORE_CTRL); + val = readl(l3c_pmu->base + L3C_PERF_CTRL); + val &= ~L3C_CORE_EN; + writel(val, l3c_pmu->base + L3C_PERF_CTRL); + + /* Disable core-tracetag statistics */ + val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); + val &= ~L3C_TRACETAG_CORE_EN; + writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); + } +} + +static void hisi_l3c_pmu_enable_filter(struct perf_event *event) +{ + if (event->attr.config1 != 0x0) { + hisi_l3c_pmu_config_req_tracetag(event); + hisi_l3c_pmu_config_core_tracetag(event); + hisi_l3c_pmu_config_ds(event); + } +} + +static void hisi_l3c_pmu_disable_filter(struct perf_event *event) +{ + if (event->attr.config1 != 0x0) { + hisi_l3c_pmu_clear_ds(event); + hisi_l3c_pmu_clear_core_tracetag(event); + hisi_l3c_pmu_clear_req_tracetag(event); + } +} /* * Select the counter register offset using the counter index @@ -49,29 +233,13 @@ static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx) static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu, struct hw_perf_event *hwc) { - u32 idx = hwc->idx; - - if (!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) { - dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx); - return 0; - } - - /* Read 64-bits and the upper 16 bits are RAZ */ - return readq(l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(idx)); + return readq(l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx)); } static void hisi_l3c_pmu_write_counter(struct hisi_pmu *l3c_pmu, struct hw_perf_event *hwc, u64 val) { - u32 idx = hwc->idx; - - if 
(!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) { - dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx); - return; - } - - /* Write 64-bits and the upper 16 bits are WI */ - writeq(val, l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(idx)); + writeq(val, l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx)); } static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx, @@ -167,82 +335,27 @@ static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu, writel(val, l3c_pmu->base + L3C_INT_MASK); } -static irqreturn_t hisi_l3c_pmu_isr(int irq, void *dev_id) +static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu) { - struct hisi_pmu *l3c_pmu = dev_id; - struct perf_event *event; - unsigned long overflown; - int idx; - - /* Read L3C_INT_STATUS register */ - overflown = readl(l3c_pmu->base + L3C_INT_STATUS); - if (!overflown) - return IRQ_NONE; - - /* - * Find the counter index which overflowed if the bit was set - * and handle it. - */ - for_each_set_bit(idx, &overflown, L3C_NR_COUNTERS) { - /* Write 1 to clear the IRQ status flag */ - writel((1 << idx), l3c_pmu->base + L3C_INT_CLEAR); - - /* Get the corresponding event struct */ - event = l3c_pmu->pmu_events.hw_events[idx]; - if (!event) - continue; - - hisi_uncore_pmu_event_update(event); - hisi_uncore_pmu_set_event_period(event); - } - - return IRQ_HANDLED; + return readl(l3c_pmu->base + L3C_INT_STATUS); } -static int hisi_l3c_pmu_init_irq(struct hisi_pmu *l3c_pmu, - struct platform_device *pdev) +static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx) { - int irq, ret; - - /* Read and init IRQ */ - irq = platform_get_irq(pdev, 0); - if (irq < 0) - return irq; - - ret = devm_request_irq(&pdev->dev, irq, hisi_l3c_pmu_isr, - IRQF_NOBALANCING | IRQF_NO_THREAD, - dev_name(&pdev->dev), l3c_pmu); - if (ret < 0) { - dev_err(&pdev->dev, - "Fail to request IRQ:%d ret:%d\n", irq, ret); - return ret; - } - - l3c_pmu->irq = irq; - - return 0; + writel(1 << idx, 
l3c_pmu->base + L3C_INT_CLEAR); } static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = { { "HISI0213", }, - {}, + { "HISI0214", }, + {} }; MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match); static int hisi_l3c_pmu_init_data(struct platform_device *pdev, struct hisi_pmu *l3c_pmu) { - unsigned long long id; struct resource *res; - acpi_status status; - - status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev), - "_UID", NULL, &id); - if (ACPI_FAILURE(status)) - return -EINVAL; - - l3c_pmu->index_id = id; - /* * Use the SCCL_ID and CCL_ID to identify the L3C PMU, while * SCCL_ID is in MPIDR[aff2] and CCL_ID is in MPIDR[aff1]. @@ -266,20 +379,36 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev, return PTR_ERR(l3c_pmu->base); } + l3c_pmu->identifier = readl(l3c_pmu->base + L3C_VERSION); + return 0; } -static struct attribute *hisi_l3c_pmu_format_attr[] = { +static struct attribute *hisi_l3c_pmu_v1_format_attr[] = { HISI_PMU_FORMAT_ATTR(event, "config:0-7"), NULL, }; -static const struct attribute_group hisi_l3c_pmu_format_group = { +static const struct attribute_group hisi_l3c_pmu_v1_format_group = { .name = "format", - .attrs = hisi_l3c_pmu_format_attr, + .attrs = hisi_l3c_pmu_v1_format_attr, }; -static struct attribute *hisi_l3c_pmu_events_attr[] = { +static struct attribute *hisi_l3c_pmu_v2_format_attr[] = { + HISI_PMU_FORMAT_ATTR(event, "config:0-7"), + HISI_PMU_FORMAT_ATTR(tt_core, "config1:0-7"), + HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"), + HISI_PMU_FORMAT_ATTR(datasrc_cfg, "config1:11-15"), + HISI_PMU_FORMAT_ATTR(datasrc_skt, "config1:16"), + NULL +}; + +static const struct attribute_group hisi_l3c_pmu_v2_format_group = { + .name = "format", + .attrs = hisi_l3c_pmu_v2_format_attr, +}; + +static struct attribute *hisi_l3c_pmu_v1_events_attr[] = { HISI_PMU_EVENT_ATTR(rd_cpipe, 0x00), HISI_PMU_EVENT_ATTR(wr_cpipe, 0x01), HISI_PMU_EVENT_ATTR(rd_hit_cpipe, 0x02), @@ -296,9 +425,22 @@ static struct attribute *hisi_l3c_pmu_events_attr[] = 
{ NULL, }; -static const struct attribute_group hisi_l3c_pmu_events_group = { +static const struct attribute_group hisi_l3c_pmu_v1_events_group = { .name = "events", - .attrs = hisi_l3c_pmu_events_attr, + .attrs = hisi_l3c_pmu_v1_events_attr, +}; + +static struct attribute *hisi_l3c_pmu_v2_events_attr[] = { + HISI_PMU_EVENT_ATTR(l3c_hit, 0x48), + HISI_PMU_EVENT_ATTR(cycles, 0x7f), + HISI_PMU_EVENT_ATTR(l3c_ref, 0xb8), + HISI_PMU_EVENT_ATTR(dat_access, 0xb9), + NULL +}; + +static const struct attribute_group hisi_l3c_pmu_v2_events_group = { + .name = "events", + .attrs = hisi_l3c_pmu_v2_events_attr, }; static DEVICE_ATTR(cpumask, 0444, hisi_cpumask_sysfs_show, NULL); @@ -312,13 +454,34 @@ static const struct attribute_group hisi_l3c_pmu_cpumask_attr_group = { .attrs = hisi_l3c_pmu_cpumask_attrs, }; -static const struct attribute_group *hisi_l3c_pmu_attr_groups[] = { - &hisi_l3c_pmu_format_group, - &hisi_l3c_pmu_events_group, +static struct device_attribute hisi_l3c_pmu_identifier_attr = + __ATTR(identifier, 0444, hisi_uncore_pmu_identifier_attr_show, NULL); + +static struct attribute *hisi_l3c_pmu_identifier_attrs[] = { + &hisi_l3c_pmu_identifier_attr.attr, + NULL +}; + +static struct attribute_group hisi_l3c_pmu_identifier_group = { + .attrs = hisi_l3c_pmu_identifier_attrs, +}; + +static const struct attribute_group *hisi_l3c_pmu_v1_attr_groups[] = { + &hisi_l3c_pmu_v1_format_group, + &hisi_l3c_pmu_v1_events_group, &hisi_l3c_pmu_cpumask_attr_group, + &hisi_l3c_pmu_identifier_group, NULL, }; +static const struct attribute_group *hisi_l3c_pmu_v2_attr_groups[] = { + &hisi_l3c_pmu_v2_format_group, + &hisi_l3c_pmu_v2_events_group, + &hisi_l3c_pmu_cpumask_attr_group, + &hisi_l3c_pmu_identifier_group, + NULL +}; + static const struct hisi_uncore_ops hisi_uncore_l3c_ops = { .write_evtype = hisi_l3c_pmu_write_evtype, .get_event_idx = hisi_uncore_pmu_get_event_idx, @@ -330,6 +493,10 @@ static const struct hisi_uncore_ops hisi_uncore_l3c_ops = { .disable_counter_int = 
hisi_l3c_pmu_disable_counter_int, .write_counter = hisi_l3c_pmu_write_counter, .read_counter = hisi_l3c_pmu_read_counter, + .get_int_status = hisi_l3c_pmu_get_int_status, + .clear_int_status = hisi_l3c_pmu_clear_int_status, + .enable_filter = hisi_l3c_pmu_enable_filter, + .disable_filter = hisi_l3c_pmu_disable_filter, }; static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev, @@ -341,16 +508,24 @@ static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev, if (ret) return ret; - ret = hisi_l3c_pmu_init_irq(l3c_pmu, pdev); + ret = hisi_uncore_pmu_init_irq(l3c_pmu, pdev); if (ret) return ret; + if (l3c_pmu->identifier >= HISI_PMU_V2) { + l3c_pmu->counter_bits = 64; + l3c_pmu->check_event = L3C_V2_NR_EVENTS; + l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v2_attr_groups; + } else { + l3c_pmu->counter_bits = 48; + l3c_pmu->check_event = L3C_V1_NR_EVENTS; + l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v1_attr_groups; + } + l3c_pmu->num_counters = L3C_NR_COUNTERS; - l3c_pmu->counter_bits = 48; l3c_pmu->ops = &hisi_uncore_l3c_ops; l3c_pmu->dev = &pdev->dev; l3c_pmu->on_cpu = -1; - l3c_pmu->check_event = 0x59; return 0; } @@ -371,6 +546,11 @@ static int hisi_l3c_pmu_probe(struct platform_device *pdev) if (ret) return ret; + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_l3c%u", + l3c_pmu->sccl_id, l3c_pmu->ccl_id); + if (!name) + return -ENOMEM; + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, &l3c_pmu->node); if (ret) { @@ -378,8 +558,6 @@ static int hisi_l3c_pmu_probe(struct platform_device *pdev) return ret; } - name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_l3c%u", - l3c_pmu->sccl_id, l3c_pmu->index_id); l3c_pmu->pmu = (struct pmu) { .name = name, .module = THIS_MODULE, @@ -392,15 +570,16 @@ static int hisi_l3c_pmu_probe(struct platform_device *pdev) .start = hisi_uncore_pmu_start, .stop = hisi_uncore_pmu_stop, .read = hisi_uncore_pmu_read, - .attr_groups = hisi_l3c_pmu_attr_groups, + .attr_groups = 
l3c_pmu->pmu_events.attr_groups, .capabilities = PERF_PMU_CAP_NO_EXCLUDE, }; ret = perf_pmu_register(&l3c_pmu->pmu, name, -1); if (ret) { dev_err(l3c_pmu->dev, "L3C PMU register failed!\n"); - cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, - &l3c_pmu->node); + cpuhp_state_remove_instance_nocalls( + CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, &l3c_pmu->node); + irq_set_affinity_hint(l3c_pmu->irq, NULL); } return ret; @@ -411,8 +590,9 @@ static int hisi_l3c_pmu_remove(struct platform_device *pdev) struct hisi_pmu *l3c_pmu = platform_get_drvdata(pdev); perf_pmu_unregister(&l3c_pmu->pmu); - cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, - &l3c_pmu->node); + cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, + &l3c_pmu->node); + irq_set_affinity_hint(l3c_pmu->irq, NULL); return 0; } diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c index 79f76f8dda8e1bfe0901852737a5dc9c4c3deaf2..c7a62a87118356b78714b87a8274033d2ad486cc 100644 --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c @@ -15,12 +15,13 @@ #include #include +#include #include #include "hisi_uncore_pmu.h" #define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff) -#define HISI_MAX_PERIOD(nr) (BIT_ULL(nr) - 1) +#define HISI_MAX_PERIOD(nr) (GENMASK_ULL((nr) - 1, 0)) /* * PMU format attributes @@ -34,6 +35,7 @@ ssize_t hisi_format_sysfs_show(struct device *dev, return sprintf(buf, "%s\n", (char *)eattr->var); } +EXPORT_SYMBOL_GPL(hisi_format_sysfs_show); /* * PMU event attributes @@ -47,6 +49,7 @@ ssize_t hisi_event_sysfs_show(struct device *dev, return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var); } +EXPORT_SYMBOL_GPL(hisi_event_sysfs_show); /* * sysfs cpumask attributes. 
For uncore PMU, we only have a single CPU to show @@ -58,6 +61,7 @@ ssize_t hisi_cpumask_sysfs_show(struct device *dev, return sprintf(buf, "%d\n", hisi_pmu->on_cpu); } +EXPORT_SYMBOL_GPL(hisi_cpumask_sysfs_show); static bool hisi_validate_event_group(struct perf_event *event) { @@ -92,11 +96,6 @@ static bool hisi_validate_event_group(struct perf_event *event) return counters <= hisi_pmu->num_counters; } -int hisi_uncore_pmu_counter_valid(struct hisi_pmu *hisi_pmu, int idx) -{ - return idx >= 0 && idx < hisi_pmu->num_counters; -} - int hisi_uncore_pmu_get_event_idx(struct perf_event *event) { struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); @@ -112,16 +111,76 @@ int hisi_uncore_pmu_get_event_idx(struct perf_event *event) return idx; } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_get_event_idx); + +ssize_t hisi_uncore_pmu_identifier_attr_show(struct device *dev, + struct device_attribute *attr, + char *page) +{ + struct hisi_pmu *hisi_pmu = to_hisi_pmu(dev_get_drvdata(dev)); + + return snprintf(page, PAGE_SIZE, "0x%08x\n", hisi_pmu->identifier); +} +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_identifier_attr_show); static void hisi_uncore_pmu_clear_event_idx(struct hisi_pmu *hisi_pmu, int idx) { - if (!hisi_uncore_pmu_counter_valid(hisi_pmu, idx)) { - dev_err(hisi_pmu->dev, "Unsupported event index:%d!\n", idx); - return; + clear_bit(idx, hisi_pmu->pmu_events.used_mask); +} + +static irqreturn_t hisi_uncore_pmu_isr(int irq, void *data) +{ + struct hisi_pmu *hisi_pmu = data; + struct perf_event *event; + unsigned long overflown; + int idx; + + overflown = hisi_pmu->ops->get_int_status(hisi_pmu); + if (!overflown) + return IRQ_NONE; + + /* + * Find the counter index which overflowed if the bit was set + * and handle it. 
+ */ + for_each_set_bit(idx, &overflown, hisi_pmu->num_counters) { + /* Write 1 to clear the IRQ status flag */ + hisi_pmu->ops->clear_int_status(hisi_pmu, idx); + /* Get the corresponding event struct */ + event = hisi_pmu->pmu_events.hw_events[idx]; + if (!event) + continue; + + hisi_uncore_pmu_event_update(event); + hisi_uncore_pmu_set_event_period(event); } - clear_bit(idx, hisi_pmu->pmu_events.used_mask); + return IRQ_HANDLED; +} + +int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu, + struct platform_device *pdev) +{ + int irq, ret; + + irq = platform_get_irq(pdev, 0); + if (irq < 0) + return irq; + + ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr, + IRQF_NOBALANCING | IRQF_NO_THREAD, + dev_name(&pdev->dev), hisi_pmu); + if (ret < 0) { + dev_err(&pdev->dev, + "Fail to request IRQ: %d ret: %d.\n", irq, ret); + return ret; + } + + hisi_pmu->irq = irq; + + return 0; } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_init_irq); int hisi_uncore_pmu_event_init(struct perf_event *event) { @@ -172,6 +231,7 @@ int hisi_uncore_pmu_event_init(struct perf_event *event) return 0; } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_event_init); /* * Set the counter to count the event that we're interested in, @@ -185,6 +245,9 @@ static void hisi_uncore_pmu_enable_event(struct perf_event *event) hisi_pmu->ops->write_evtype(hisi_pmu, hwc->idx, HISI_GET_EVENTID(event)); + if (hisi_pmu->ops->enable_filter) + hisi_pmu->ops->enable_filter(event); + hisi_pmu->ops->enable_counter_int(hisi_pmu, hwc); hisi_pmu->ops->enable_counter(hisi_pmu, hwc); } @@ -199,6 +262,9 @@ static void hisi_uncore_pmu_disable_event(struct perf_event *event) hisi_pmu->ops->disable_counter(hisi_pmu, hwc); hisi_pmu->ops->disable_counter_int(hisi_pmu, hwc); + + if (hisi_pmu->ops->disable_filter) + hisi_pmu->ops->disable_filter(event); } void hisi_uncore_pmu_set_event_period(struct perf_event *event) @@ -219,6 +285,7 @@ void hisi_uncore_pmu_set_event_period(struct perf_event *event) /* Write start value to the hardware 
event counter */ hisi_pmu->ops->write_counter(hisi_pmu, hwc, val); } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_set_event_period); void hisi_uncore_pmu_event_update(struct perf_event *event) { @@ -239,6 +306,7 @@ void hisi_uncore_pmu_event_update(struct perf_event *event) HISI_MAX_PERIOD(hisi_pmu->counter_bits); local64_add(delta, &event->count); } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_event_update); void hisi_uncore_pmu_start(struct perf_event *event, int flags) { @@ -261,6 +329,7 @@ void hisi_uncore_pmu_start(struct perf_event *event, int flags) hisi_uncore_pmu_enable_event(event); perf_event_update_userpage(event); } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_start); void hisi_uncore_pmu_stop(struct perf_event *event, int flags) { @@ -277,6 +346,7 @@ void hisi_uncore_pmu_stop(struct perf_event *event, int flags) hisi_uncore_pmu_event_update(event); hwc->state |= PERF_HES_UPTODATE; } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_stop); int hisi_uncore_pmu_add(struct perf_event *event, int flags) { @@ -299,6 +369,7 @@ int hisi_uncore_pmu_add(struct perf_event *event, int flags) return 0; } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_add); void hisi_uncore_pmu_del(struct perf_event *event, int flags) { @@ -310,12 +381,14 @@ void hisi_uncore_pmu_del(struct perf_event *event, int flags) perf_event_update_userpage(event); hisi_pmu->pmu_events.hw_events[hwc->idx] = NULL; } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_del); void hisi_uncore_pmu_read(struct perf_event *event) { /* Read hardware counter and update the perf counter statistics */ hisi_uncore_pmu_event_update(event); } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read); void hisi_uncore_pmu_enable(struct pmu *pmu) { @@ -328,6 +401,7 @@ void hisi_uncore_pmu_enable(struct pmu *pmu) hisi_pmu->ops->start_counters(hisi_pmu); } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_enable); void hisi_uncore_pmu_disable(struct pmu *pmu) { @@ -335,30 +409,46 @@ void hisi_uncore_pmu_disable(struct pmu *pmu) hisi_pmu->ops->stop_counters(hisi_pmu); } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_disable); + 
/* - * Read Super CPU cluster and CPU cluster ID from MPIDR_EL1. - * If multi-threading is supported, CCL_ID is the low 3-bits in MPIDR[Aff2] - * and SCCL_ID is the upper 5-bits of Aff2 field; if not, SCCL_ID - * is in MPIDR[Aff2] and CCL_ID is in MPIDR[Aff1]. + * The Super CPU Cluster (SCCL) and CPU Cluster (CCL) IDs can be + * determined from the MPIDR_EL1, but the encoding varies by CPU: + * + * - For MT variants of TSV110: + * SCCL is Aff2[7:3], CCL is Aff2[2:0] + * + * - For other MT parts: + * SCCL is Aff3[7:0], CCL is Aff2[7:0] + * + * - For non-MT parts: + * SCCL is Aff2[7:0], CCL is Aff1[7:0] */ -static void hisi_read_sccl_and_ccl_id(int *sccl_id, int *ccl_id) +static void hisi_read_sccl_and_ccl_id(int *scclp, int *cclp) { u64 mpidr = read_cpuid_mpidr(); - - if (mpidr & MPIDR_MT_BITMASK) { - int aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2); - - if (sccl_id) - *sccl_id = aff2 >> 3; - if (ccl_id) - *ccl_id = aff2 & 0x7; + int aff3 = MPIDR_AFFINITY_LEVEL(mpidr, 3); + int aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2); + int aff1 = MPIDR_AFFINITY_LEVEL(mpidr, 1); + bool mt = mpidr & MPIDR_MT_BITMASK; + int sccl, ccl; + + if (mt && read_cpuid_part_number() == HISI_CPU_PART_TSV110) { + sccl = aff2 >> 3; + ccl = aff2 & 0x7; + } else if (mt) { + sccl = aff3; + ccl = aff2; } else { - if (sccl_id) - *sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2); - if (ccl_id) - *ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); + sccl = aff2; + ccl = aff1; } + + if (scclp) + *scclp = sccl; + if (cclp) + *cclp = ccl; } /* @@ -398,10 +488,11 @@ int hisi_uncore_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) hisi_pmu->on_cpu = cpu; /* Overflow interrupt also should use the same CPU */ - WARN_ON(irq_set_affinity(hisi_pmu->irq, cpumask_of(cpu))); + WARN_ON(irq_set_affinity_hint(hisi_pmu->irq, cpumask_of(cpu))); return 0; } +EXPORT_SYMBOL_GPL(hisi_uncore_pmu_online_cpu); int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) { @@ -430,7 +521,10 @@ int hisi_uncore_pmu_offline_cpu(unsigned 
int cpu, struct hlist_node *node)
 	perf_pmu_migrate_context(&hisi_pmu->pmu, cpu, target);
 	/* Use this CPU for event counting */
 	hisi_pmu->on_cpu = target;
-	WARN_ON(irq_set_affinity(hisi_pmu->irq, cpumask_of(target)));
+	WARN_ON(irq_set_affinity_hint(hisi_pmu->irq, cpumask_of(target)));
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(hisi_uncore_pmu_offline_cpu);
+
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.h b/drivers/perf/hisilicon/hisi_uncore_pmu.h
index 25b0c97b3eb0a597d9159bfdbc4594229e0ca5bc..78c00d2d0ee677d23985d4490b1333917d116f44 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.h
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.h
@@ -11,15 +11,18 @@
 #ifndef __HISI_UNCORE_PMU_H__
 #define __HISI_UNCORE_PMU_H__
 
+#include
 #include
 #include
 #include
 #include
+#include
 #include
 
 #undef pr_fmt
 #define pr_fmt(fmt) "hisi_pmu: " fmt
 
+#define HISI_PMU_V2 0x30
 #define HISI_MAX_COUNTERS 0x10
 #define to_hisi_pmu(p) (container_of(p, struct hisi_pmu, pmu))
 
@@ -33,6 +36,12 @@
 #define HISI_PMU_EVENT_ATTR(_name, _config) \
 	HISI_PMU_ATTR(_name, hisi_event_sysfs_show, (unsigned long)_config)
 
+#define HISI_PMU_EVENT_ATTR_EXTRACTOR(name, config, hi, lo) \
+	static inline u32 hisi_get_##name(struct perf_event *event) \
+	{ \
+		return FIELD_GET(GENMASK_ULL(hi, lo), event->attr.config); \
+	}
+
 struct hisi_pmu;
 
 struct hisi_uncore_ops {
@@ -46,11 +55,16 @@ struct hisi_uncore_ops {
 	void (*disable_counter_int)(struct hisi_pmu *, struct hw_perf_event *);
 	void (*start_counters)(struct hisi_pmu *);
 	void (*stop_counters)(struct hisi_pmu *);
+	u32 (*get_int_status)(struct hisi_pmu *hisi_pmu);
+	void (*clear_int_status)(struct hisi_pmu *hisi_pmu, int idx);
+	void (*enable_filter)(struct perf_event *event);
+	void (*disable_filter)(struct perf_event *event);
 };
 
 struct hisi_pmu_hwevents {
 	struct perf_event *hw_events[HISI_MAX_COUNTERS];
 	DECLARE_BITMAP(used_mask, HISI_MAX_COUNTERS);
+	const struct attribute_group **attr_groups;
 };
 
 /* Generic pmu struct for different pmu types */
@@ -70,13 +84,15 @@ struct hisi_pmu {
 	void __iomem *base;
 	/* the ID of the PMU modules */
 	u32 index_id;
+	/* For DDRC PMU v2: each DDRC has more than one DMC */
+	u32 sub_id;
 	int num_counters;
 	int counter_bits;
 	/* check event code range */
 	int check_event;
+	u32 identifier;
 };
 
-int hisi_uncore_pmu_counter_valid(struct hisi_pmu *hisi_pmu, int idx);
 int hisi_uncore_pmu_get_event_idx(struct perf_event *event);
 void hisi_uncore_pmu_read(struct perf_event *event);
 int hisi_uncore_pmu_add(struct perf_event *event, int flags);
@@ -96,4 +112,11 @@ ssize_t hisi_cpumask_sysfs_show(struct device *dev,
 			struct device_attribute *attr, char *buf);
 int hisi_uncore_pmu_online_cpu(unsigned int cpu, struct hlist_node *node);
 int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node);
+
+ssize_t hisi_uncore_pmu_identifier_attr_show(struct device *dev,
+					     struct device_attribute *attr,
+					     char *page);
+int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu,
+			     struct platform_device *pdev);
+
+#endif /* __HISI_UNCORE_PMU_H__ */
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index 8e7e2ec37f1b295f2615b92ea253f6f9f1f09025..64f700254ca0ff36f1268e20bcf1eeded3a32dbb 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -21,6 +21,7 @@
  */
 #define IORT_SMMU_V3_PMCG_GENERIC 0x00000000 /* Generic SMMUv3 PMCG */
 #define IORT_SMMU_V3_PMCG_HISI_HIP08 0x00000001 /* HiSilicon HIP08 PMCG */
+#define IORT_SMMU_V3_PMCG_HISI_HIP09 0x00000002 /* HiSilicon HIP09 PMCG */
 
 int iort_register_domain_token(int trans_id, phys_addr_t base,
 			       struct fwnode_handle *fw_node);
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 2d55cee638fc6071b3e523a75ed569f7acec0a3f..2bdc493f759496752a10dfe4d6f6b167242d63f1 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -165,6 +165,9 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE,
 	CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE,
 	CPUHP_AP_PERF_ARM_HISI_L3_ONLINE,
+	#ifndef __GENKSYMS__
+	CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE,
+	#endif
 	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
 	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
 	CPUHP_AP_PERF_ARM_QCOM_L3_ONLINE,
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index cf34dbaf2c114fce63f3d30becdb3aef4ffcddf3..569f87d699ccf781fcb9bff473764832984c6080 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -301,39 +301,8 @@ struct irq_affinity_desc {
 
 extern cpumask_var_t irq_default_affinity;
 
-/* Internal implementation. Use the helpers below */
-extern int __irq_set_affinity(unsigned int irq, const struct cpumask *cpumask,
-			      bool force);
-
-/**
- * irq_set_affinity - Set the irq affinity of a given irq
- * @irq: Interrupt to set affinity
- * @cpumask: cpumask
- *
- * Fails if cpumask does not contain an online CPU
- */
-static inline int
-irq_set_affinity(unsigned int irq, const struct cpumask *cpumask)
-{
-	return __irq_set_affinity(irq, cpumask, false);
-}
-
-/**
- * irq_force_affinity - Force the irq affinity of a given irq
- * @irq: Interrupt to set affinity
- * @cpumask: cpumask
- *
- * Same as irq_set_affinity, but without checking the mask against
- * online cpus.
- *
- * Solely for low level cpu hotplug code, where we need to make per
- * cpu interrupts affine before the cpu becomes online.
- */
-static inline int
-irq_force_affinity(unsigned int irq, const struct cpumask *cpumask)
-{
-	return __irq_set_affinity(irq, cpumask, true);
-}
+extern int irq_set_affinity(unsigned int irq, const struct cpumask *cpumask);
+extern int irq_force_affinity(unsigned int irq, const struct cpumask *cpumask);
 
 extern int irq_can_set_affinity(unsigned int irq);
 extern int irq_select_affinity(unsigned int irq);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 918fe0593386219c9e3449bf82f391086c2fd3b5..39cee1fea51ecc353a5e62bddb35b84004c2afff 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -332,7 +332,8 @@ int irq_set_affinity_locked(struct irq_data *data, const struct cpumask *mask,
 	return ret;
 }
 
-int __irq_set_affinity(unsigned int irq, const struct cpumask *mask, bool force)
+static int __irq_set_affinity(unsigned int irq, const struct cpumask *mask,
+			      bool force)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
 	unsigned long flags;
@@ -347,6 +348,36 @@ int __irq_set_affinity(unsigned int irq, const struct cpumask *mask, bool force)
 	return ret;
 }
 
+/**
+ * irq_set_affinity - Set the irq affinity of a given irq
+ * @irq: Interrupt to set affinity
+ * @cpumask: cpumask
+ *
+ * Fails if cpumask does not contain an online CPU
+ */
+int irq_set_affinity(unsigned int irq, const struct cpumask *cpumask)
+{
+	return __irq_set_affinity(irq, cpumask, false);
+}
+EXPORT_SYMBOL_GPL(irq_set_affinity);
+
+/**
+ * irq_force_affinity - Force the irq affinity of a given irq
+ * @irq: Interrupt to set affinity
+ * @cpumask: cpumask
+ *
+ * Same as irq_set_affinity, but without checking the mask against
+ * online cpus.
+ *
+ * Solely for low level cpu hotplug code, where we need to make per
+ * cpu interrupts affine before the cpu becomes online.
+ */
+int irq_force_affinity(unsigned int irq, const struct cpumask *cpumask)
+{
+	return __irq_set_affinity(irq, cpumask, true);
+}
+EXPORT_SYMBOL_GPL(irq_force_affinity);
+
 int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
 {
 	unsigned long flags;