Operating Profit Indicators Using Robust Regression

Operating profit indicators such as the operating profit margin (μ), defined as the ratio of operating profit to net sales revenue, can vary among enterprises in the same industry:

(1) μ(i) = { S(i) – C(i) } / S(i)

where S denotes net sales revenue, C denotes cost and expenses, and i = 1, 2, 3, …, N indexes the observations (N is the sample size).

Solve eq. (1) for S, and obtain the “return on cost” (ROC) equation in which sales revenue is proportional to supply cost:

(2) S(i) = λ(i) C(i)

where each quotient λ(i) = 1 / (1 − μ(i)) is the operating profit markup per enterprise.
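As a minimal sketch of eqs. (1) and (2), the snippet below computes the margin and the markup for a single observation and checks that the two are linked by λ = 1 / (1 − μ). The function names and the figures are hypothetical, chosen only for illustration; they are not taken from the exhibits.

```python
def operating_margin(sales, cost):
    """Eq. (1): mu = (S - C) / S."""
    if sales == 0:
        raise ValueError("net sales revenue must be non-zero")
    return (sales - cost) / sales


def operating_markup(sales, cost):
    """Eq. (2) rearranged: lambda = S / C."""
    if cost == 0:
        raise ValueError("cost must be non-zero")
    return sales / cost


# Hypothetical figures for one observation (same currency units).
sales, cost = 125.0, 100.0
mu = operating_margin(sales, cost)    # margin on sales
lam = operating_markup(sales, cost)   # markup over cost

# The markup implied by the margin should match S / C.
implied_lam = 1.0 / (1.0 - mu)
```

With these inputs the margin is 20% of sales while the markup is 25% over cost, which illustrates why the two indicators should not be used interchangeably.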

The operating profit markup per product or per service is a romantic ideal divorced from accounting reality because the enterprises used as comparables (peer group) to the tested party are multi-product entities that publicly report consolidated group accounts. For example, COGS, OPEX (aka XSGA), Depreciation (DFXA), and Amortization (AM) are reported in aggregate (multi-product) accounts; they are not reported per product.

The price markup over cost in eq. (2) is a vintage idea shared by many economists, including Adam Smith (1776), David Ricardo (1818), Augustin Cournot (1838), Karl Marx (1867), Piero Sraffa (1926), Abba Lerner (1934), and Michał Kalecki (1943).

If λ = λ(1) = λ(2) = … = λ(N), which is a standard convergence assumption, we can use ordinary least squares (OLS) or robust regression algorithms to estimate the expected (uniform) slope coefficient, or operating profit markup, for the combined group or for each individual enterprise:

(3) S(i) = λ C(i) + U(i)

A random error term (U) is added to eq. (2), turning it into the stochastic eq. (3), because the functional relationship between sales revenue and total cost is not exact. See Hacking (2006) for the origins of probabilistic versus deterministic conceptions of nature and society.
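To make eq. (3) concrete, here is a minimal sketch of the OLS estimate of λ when the regression line is forced through the origin, as eq. (3) is written. In that case the estimate reduces to Σ C(i)S(i) / Σ C(i)². The observations are hypothetical, and the function name is ours.

```python
def ols_slope_through_origin(cost, sales):
    """Least-squares estimate of lambda in S(i) = lambda * C(i) + U(i)."""
    if len(cost) != len(sales) or not cost:
        raise ValueError("cost and sales must be non-empty and of equal length")
    sum_cc = sum(c * c for c in cost)
    sum_cs = sum(c * s for c, s in zip(cost, sales))
    return sum_cs / sum_cc


# Hypothetical observations for one enterprise (same currency units).
cost_obs = [100.0, 200.0, 300.0, 400.0]
sales_obs = [112.0, 218.0, 333.0, 437.0]

lam_hat = ols_slope_through_origin(cost_obs, sales_obs)

# Residuals U(i): the part of sales not explained by lambda * cost.
residuals = [s - lam_hat * c for c, s in zip(cost_obs, sales_obs)]
```

If an intercept is included, as the intercept tests reported in the exhibits suggest, the slope formula changes to the usual covariance-over-variance form; the through-origin version above is only the simplest reading of eq. (3).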

We can obtain the profit margin per enterprise, or per comparable group of enterprises, from the price markup eq. (3) by indirect least squares (ILS), because the operating profit markup and the operating profit margin are related by the coefficient equations:

(4) λ = 1 / (1 – μ) implies μ = (λ – 1) / λ for λ > 1
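As a worked illustration of eq. (4), the inversion step and two numerical conversions are written out below. The slopes are the "All" rows of Exhibits 1 and 2; the resulting margins are our own arithmetic, rounded to one decimal place, and are not reported in the exhibits.

```latex
\begin{align*}
\lambda = \frac{1}{1-\mu}
  \;&\Longrightarrow\; 1-\mu = \frac{1}{\lambda}
  \;\Longrightarrow\; \mu = 1 - \frac{1}{\lambda} = \frac{\lambda-1}{\lambda} \\[4pt]
\text{Lato (Exhibit 1, All): } \lambda = 1.099
  \;&\Longrightarrow\; \mu = \frac{0.099}{1.099} \approx 0.090 \;(9.0\%) \\[4pt]
\text{Stricto (Exhibit 2, All): } \lambda = 1.162
  \;&\Longrightarrow\; \mu = \frac{0.162}{1.162} \approx 0.139 \;(13.9\%)
\end{align*}
```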

Var(U(i)) = σ² is assumed to be constant across the N observations in the sample. However, this equal-variance assumption is troubling in the presence of outliers. Ordinary least squares (OLS) produces unreliable estimates of the regression coefficients and their standard errors if the sample is contaminated with outliers.

In RoyaltyStat, we built a Robust Regression module, and Exhibits 1, 2 and 3 show Huber regression results, which are resilient to outliers. See Lawrence & Arthur (1990), Chapter 13 (A Comparison of Regression Estimators), Staudte & Sheather (1990), Chapter 7 (Regression), or Huber & Ronchetti (2009), Chapter 7 (Regression).
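To show why a Huber estimator resists outliers, the sketch below fits eq. (3) by iteratively reweighted least squares (IRLS) with Huber weights. It is not the RoyaltyStat implementation. We assume a regression through the origin, the conventional tuning constant k = 1.345, and a residual scale re-estimated at each step from the normalized median absolute deviation (MAD / 0.6745). The data are hypothetical: seven observations close to a markup of 1.10 and one gross outlier.

```python
from statistics import median


def ols_slope(cost, sales):
    """OLS slope through the origin; used as the starting value."""
    return sum(c * s for c, s in zip(cost, sales)) / sum(c * c for c in cost)


def huber_slope(cost, sales, k=1.345, tol=1e-10, max_iter=100):
    """Huber M-estimate of the slope via IRLS (assumed settings, see text)."""
    lam = ols_slope(cost, sales)
    for _ in range(max_iter):
        resid = [s - lam * c for c, s in zip(cost, sales)]
        center = median(resid)
        # Robust scale: normalized median absolute deviation of residuals.
        scale = median(abs(r - center) for r in resid) / 0.6745
        if scale == 0.0:
            break  # residuals are degenerate; keep the current estimate
        cutoff = k * scale
        # Huber weights: 1 inside the cutoff, shrinking as |r| grows.
        weights = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        num = sum(w * c * s for w, c, s in zip(weights, cost, sales))
        den = sum(w * c * c for w, c in zip(weights, cost))
        new_lam = num / den
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


# Hypothetical data: the last observation is a gross outlier.
cost = [100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0, 450.0]
sales = [110.5, 164.0, 221.0, 274.0, 331.0, 384.0, 441.0, 700.0]

lam_ols = ols_slope(cost, sales)
lam_huber = huber_slope(cost, sales)
```

On this sample the OLS slope is pulled well above 1.10 by the single outlier, while the Huber slope stays close to the markup implied by the other seven observations, because the outlier receives a weight near zero.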

We estimate eq. (3) defining Total Cost in two accounting variants:

First, Total Cost (Lato) = (COGS + XSGA) + (DP – AM), where COGS is cost of goods sold, XSGA is operating (selling, general & administration) expenses, and DP is the sum of the depreciation of tangible assets (DFXA) and the amortization of acquired intangibles (AM). In Standard & Poor’s Compustat mnemonics, DP = DFXA + AM.

Plugged into eq. (3), Total Cost (Lato) can produce an operating profit markup after “a reasonable allowance for depreciation and amortization.” However, reported DP is a dirty accounting number, especially when it includes impairment charges. A more reliable alternative is to exclude DP before computing operating profit indicators, and then allow a reasonable deduction for depreciation before making a transfer pricing adjustment.

Second, Total Cost (Stricto) = (COGS + XSGA), excluding DFXA and AM. Exhibits 1 and 2 show that the reported allowance for depreciation (even after excluding amortization) may not be reasonable, or at least that DFXA varies substantially among enterprises in the same industry. Exhibit 3 compares the results of Exhibits 1 and 2 visually.
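The two cost definitions can be sketched as follows. The argument names follow the Compustat mnemonics used above; the function names and the figures are hypothetical.

```python
def total_cost_lato(cogs, xsga, dp, am):
    """(COGS + XSGA) + (DP - AM). Since DP = DFXA + AM, this adds DFXA only."""
    return (cogs + xsga) + (dp - am)


def total_cost_stricto(cogs, xsga):
    """(COGS + XSGA), excluding both DFXA and AM."""
    return cogs + xsga


# Hypothetical annual figures (same currency units).
cogs, xsga, dfxa, am = 700.0, 150.0, 40.0, 10.0
dp = dfxa + am  # Compustat identity: DP = DFXA + AM

lato = total_cost_lato(cogs, xsga, dp, am)
stricto = total_cost_stricto(cogs, xsga)

# The gap between the two variants is the depreciation of tangible assets.
gap = lato - stricto
```

Because the only difference between the variants is DFXA, any difference between the slopes estimated under the two definitions reflects the reported depreciation charge.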

Operating profit indicators such as the operating profit markups, or the ILS-derived operating profit margins, obtained from applying robust regression algorithms are reliable, enabling the computation of efficient confidence intervals for the slope coefficients (reliable ranges of profit indicators).
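As a rough sketch of how such a range could be computed, the snippet below builds an approximate 95% confidence interval for the slope and maps its endpoints into profit margins with eq. (4). It uses the rounded slope and standard error from the "All" row of Exhibit 1 and a normal critical value of 1.96, which is our assumption; a t critical value would be nearly identical with 493 observations. Because μ = (λ – 1) / λ is increasing in λ for λ > 0, the interval endpoints can be converted directly.

```python
def slope_interval(slope, std_err, z=1.96):
    """Approximate confidence interval: slope +/- z * standard error."""
    half_width = z * std_err
    return slope - half_width, slope + half_width


def markup_to_margin(lam):
    """Eq. (4): mu = (lambda - 1) / lambda."""
    return (lam - 1.0) / lam


# Rounded values from Exhibit 1, row "All".
slope, std_err = 1.099, 0.001

lam_lo, lam_hi = slope_interval(slope, std_err)
mu_lo, mu_hi = markup_to_margin(lam_lo), markup_to_margin(lam_hi)

print(f"markup range: {lam_lo:.4f} to {lam_hi:.4f}")
print(f"margin range: {mu_lo:.4%} to {mu_hi:.4%}")
```

Since the published standard error is rounded to three decimals, the resulting range is indicative only.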

We can infer two useful findings from our empirical illustration using annual financials from big oil multinational enterprises (MNEs): (i) the operating profit markups vary among the peer MNEs in the sample; and (ii) reported DFXA accounts show substantial variations among the sampled enterprises.

From these two empirical findings, we can infer further that (like the precedent set in thin-capitalization rules) the most reliable comparable for the controlled tested party may be the consolidated group’s operating profit indicator (after intra-group account eliminations). Otherwise, the operating profit markups (or the ILS-derived operating profit margins) may be more reliable if they are computed before DP (not verified in Exhibits 1, 2, or 3), or if operating profit indicators selected on sound economic grounds are computed using robust regression methods.

References

Ian Hacking, The Emergence of Probability (2nd edition), Cambridge University Press, 2006. Hacking traces the shift in the dominant paradigm from deterministic to probabilistic (or stochastic) conceptions of nature.

Peter Huber & Elvezio Ronchetti, Robust Statistics (2nd edition), Wiley, 2009.

Kenneth Lawrence & Jeffrey Arthur (editors), Robust Regression, Marcel Dekker, 1990. Like R, the statsmodels library in Python includes several algorithms to compute robust statistics. R and statsmodels contain richer statistical algorithms than the IBM Scientific Subroutines that I used at the University of California at Berkeley in 1979-1980 to process my Ph.D. (Econ.) dissertation’s survey data in Fortran. I operated a screenless IBM 1130 single-user computer system combined with a magnetic tape deck and keypunch machine.

Robert Staudte & Simon Sheather, Robust Estimation and Testing, Wiley, 1990. The authors of this textbook use Minitab macros to supplement internal statistics functions. We built a more agile Huber robust regression algorithm into RoyaltyStat.

Exhibit 1: Sales v. Total Cost (Lato), Robust Regression

GVKEY	Company Name	Count	Slope Coef.	Std Err.	t-Stat	R² (%)	Intercept at 5%
2410	BP PLC	59	1.068	0.003	317.842	99.640	Significant
2991	Chevron Corp	62	1.131	0.004	324.640	99.380	Significant
8549	ConocoPhillips	72	1.123	0.002	532.872	99.700	Insignificant
61616	Eni SpA	33	1.166	0.033	35.501	98.100	Insignificant
4503	Exxon Mobil Corp	72	1.116	0.004	264.331	99.410	Insignificant
7017	Marathon Oil Corp	66	1.084	0.004	278.778	99.520	Insignificant
12384	Shell Plc	40	1.079	0.008	127.807	99.490	Insignificant
24625	TotalEnergies SE	33	1.159	0.014	85.565	99.370	Insignificant
15247	Valero Energy Corp	43	1.039	0.001	1114.045	99.920	Insignificant
	All	493	1.099	0.001	912.865	99.520	Insignificant

Exhibit 2: Sales v. Total Cost (Stricto), Robust Regression

GVKEY	Company Name	Count	Slope Coef.	Std Err.	t-Stat	R² (%)	Intercept at 5%
2410	BP PLC	59	1.113	0.0033	332.836	99.710	Significant
2991	Chevron Corp	62	1.222	0.0051	240.659	99.780	Significant
8549	ConocoPhillips	72	1.176	0.0020	593.169	99.430	Significant
61616	Eni SpA	33	1.284	0.0337	38.152	98.310	Insignificant
4503	Exxon Mobil Corp	72	1.177	0.0042	282.376	99.670	Insignificant
7017	Marathon Oil Corp	66	1.126	0.0048	232.167	99.080	Significant
12384	Shell Plc	40	1.118	0.0083	135.054	99.670	Significant
24625	TotalEnergies SE	33	1.240	0.0079	156.418	99.640	Significant
15247	Valero Energy Corp	43	1.050	0.0019	559.532	99.920	Insignificant
	All	493	1.162	0.0015	768.774	99.500	Significant