Open Payments Risk Methodology & Limitations

State Risk Radar is a change-detection layer on top of CMS Open Payments data. It compares activity in the latest 30-day window with the preceding 30-day window and produces transparent statistical signals for providers, companies, and states.

How to read this methodology

  • The model looks for unusual changes, not absolute "good" or "bad" values.
  • All metrics are calculated on rolling 30-day windows for consistency.
  • A higher score means a stronger anomaly pattern, not legal wrongdoing.

Formula (v1, transparent weights)

Risk score is a weighted combination of four components:

  • 40% amount growth vs previous 30 days
  • 20% payment count growth vs previous 30 days
  • 25% payer concentration (share of top company)
  • 15% category mix shift (general/research/ownership)

Example expression:

  score = 0.40 * amount_growth + 0.20 * payments_growth + 0.25 * concentration + 0.15 * category_shift
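
The weighted combination above can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the function names risk_score and top_payer_share are hypothetical, the concentration helper assumes "share of top company" is measured by dollar amount, and the sketch assumes each component has already been scaled to the 0 to 1 range, which this page does not specify.

```python
# Weights copied from the v1 formula on this page.
WEIGHTS = {
    "amount_growth": 0.40,
    "payments_growth": 0.20,
    "concentration": 0.25,
    "category_shift": 0.15,
}


def top_payer_share(amounts_by_company):
    """Share of the total amount that comes from the single largest company.

    Assumption: concentration is measured on dollar amounts, not payment counts.
    """
    total = sum(amounts_by_company.values())
    if total <= 0:
        return 0.0
    return max(amounts_by_company.values()) / total


def risk_score(components):
    """Weighted sum of the four components.

    Assumption: every component is already scaled to [0, 1].
    """
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


example = {
    "amount_growth": 0.8,
    "payments_growth": 0.5,
    "concentration": top_payer_share({"Company A": 75.0, "Company B": 25.0}),
    "category_shift": 0.1,
}
print(round(risk_score(example), 4))
```

Because the weights sum to 1.0, a score computed this way stays in the same 0 to 1 range as its inputs, which keeps scores comparable across providers, companies, and states.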

Signal tags are generated from threshold rules:

  • Rapid growth (rapid_growth)
  • Concentration spike (concentration_spike)
  • Category shift (category_shift)
  • New high-amount entrant (new_high_amount_entrant)
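
As a rough sketch of how threshold rules could produce these tags, consider the Python below. The tag identifiers come from this page; everything else is an assumption made for illustration: the threshold values, the signal_tags function and its parameters, and the reading of "new high-amount entrant" as an entity with no activity in the previous window and a large amount in the current one. The actual thresholds are not published here.

```python
# Hypothetical thresholds, chosen only to make the example concrete.
RAPID_GROWTH_MIN = 1.0          # amount growth of +100% or more
CONCENTRATION_SPIKE_MIN = 0.20  # top-payer share up by 0.20 or more
CATEGORY_SHIFT_MIN = 0.30       # category mix shift of 0.30 or more
HIGH_AMOUNT_MIN = 10_000.0      # current-window amount, in dollars


def signal_tags(amount_growth, concentration_change, category_shift,
                previous_amount, current_amount):
    """Return the tags whose (assumed) threshold rule fires, in a fixed order."""
    tags = []
    # amount_growth may be None when there is no previous baseline.
    if amount_growth is not None and amount_growth >= RAPID_GROWTH_MIN:
        tags.append("rapid_growth")
    if concentration_change >= CONCENTRATION_SPIKE_MIN:
        tags.append("concentration_spike")
    if category_shift >= CATEGORY_SHIFT_MIN:
        tags.append("category_shift")
    # Assumed meaning: nothing in the previous window, a large amount now.
    if previous_amount == 0 and current_amount >= HIGH_AMOUNT_MIN:
        tags.append("new_high_amount_entrant")
    return tags


print(signal_tags(1.5, 0.05, 0.1, 1000.0, 2500.0))
```

An entity can carry several tags at once, or none; the tags describe which pattern triggered, while the score summarizes overall strength.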

Data windows and definitions

  • Current window (30d): the latest 30 days, ending at as_of_date.
  • Previous window (prev 30d): the 30 days immediately before the current window.
  • Growth: relative change between the current and previous windows.
  • State views: provider signals rolled up to the state level using each provider's state mapping.
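
The window and growth definitions above can be illustrated with a short Python sketch. The helper names window_bounds and relative_growth are hypothetical. The sketch assumes that windows are inclusive of both endpoints, that as_of_date is itself the last day of the current window, and that growth is undefined (None) when the previous window is zero; none of these details is confirmed by this page.

```python
from datetime import date, timedelta


def window_bounds(as_of_date, days=30):
    """Return ((cur_start, cur_end), (prev_start, prev_end)) as inclusive dates."""
    cur_end = as_of_date
    cur_start = as_of_date - timedelta(days=days - 1)
    # The previous window ends the day before the current one starts.
    prev_end = cur_start - timedelta(days=1)
    prev_start = prev_end - timedelta(days=days - 1)
    return (cur_start, cur_end), (prev_start, prev_end)


def relative_growth(current, previous):
    """Relative change: (current - previous) / previous.

    Returns None when the previous window is zero, because the ratio is
    undefined without a baseline.
    """
    if previous == 0:
        return None
    return (current - previous) / previous


current_window, previous_window = window_bounds(date(2024, 3, 31))
print(current_window, previous_window)
print(relative_growth(150.0, 100.0))
```

Returning None for a zero baseline, rather than an arbitrarily large number, keeps brand-new entities from dominating growth rankings; how the real pipeline treats that case is not stated here.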

Primary data sources

  • CMS Open Payments public datasets (openpaymentsdata.cms.gov)
  • NPI/provider identity and address mapping fields in the usnpi pipeline
  • Internal daily aggregation tables generated by parser jobs

Legal & compliance notice

  • Scores and signals are informational statistical indicators only.
  • They do not establish fraud, misconduct, conflict of interest, or legal liability.
  • No medical, legal, or compliance advice is provided on this page.
  • Users should validate findings with primary records and domain experts.

Known limitations

  • Data quality depends on source reporting completeness and timeliness.
  • Entity matching can be affected by naming/address normalization differences.
  • Small baselines can produce large percentage growth values from modest absolute changes.
  • Scores are sensitive to short-term windows and may fluctuate day to day.