Analyzing TI feeds for overlap, novelty and aging

Note imported from Inbox during bulk consolidation.

Summary

Blog post that evaluates 16 freely available threat intelligence (TI) feeds from three analytical perspectives: overlap (indicators shared between feeds), novelty (rate at which indicators are added and removed), and aging (indicator persistence over time). The post uses Combine for data gathering and tiq-test, which runs in the R programming environment, for analysis.

Content

Tools Used

  • Combine (GitHub) - TI feed data gathering tool, stores data in tiq-test compatible format
  • tiq-test (GitHub) - Test suite for evaluating TI feeds, runs in R programming environment

Gathering the Data

  • Focus on feeds containing IP addresses as indicators of compromise (IoC)
  • Combine gathers approximately 18 MB of gzipped CSV data per day
  • Data gathering step takes about 10 minutes on a typical desktop computer
  • Tests from tiq-test suite run against this dataset in R
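
To make the data-gathering step concrete, here is a minimal Python sketch of reading one gzipped CSV export into per-feed sets of IP addresses. It is not Combine's or tiq-test's own code (tiq-test runs in R), and the column names `entity`, `source` and `date` are assumptions made for illustration; check them against the actual export before reusing this.

```python
import csv
import gzip
import io
from collections import defaultdict


def load_indicators(gz_bytes):
    """Parse gzipped CSV bytes into {feed name: set of IP strings}.

    Assumes (hypothetically) that each row has an 'entity' column holding
    the IP address and a 'source' column holding the feed name.
    """
    feeds = defaultdict(set)
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            feeds[row["source"]].add(row["entity"])
    return dict(feeds)


# Tiny in-memory sample standing in for one day's export (made-up data).
sample_csv = (
    "entity,source,date\n"
    "192.0.2.1,feed_a,2020-01-01\n"
    "192.0.2.2,feed_a,2020-01-01\n"
    "192.0.2.2,feed_b,2020-01-01\n"
)
sample_gz = gzip.compress(sample_csv.encode("utf-8"))
feeds = load_indicators(sample_gz)
```

Keeping each feed as a set of strings makes the overlap, novelty and aging calculations simple set operations.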

Feed Overlap Test

  • Execution time: ~20 seconds
  • Measures which portion of a feed is contained in another feed
  • Results in graphical matrix form with numerical values in R variables
  • Key finding: all but one feed have largely unique IP address content
  • Similar overlap analysis available from MISP at misp.software/feeds
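
The overlap measure described above ("which portion of a feed is contained in another feed") can be sketched as a small Python function. This illustrates the idea only, not tiq-test's implementation; the function name `overlap_matrix` and the sample feeds are hypothetical.

```python
def overlap_matrix(feeds):
    """Return {a: {b: fraction of a's indicators that also appear in b}}.

    The measure is directional: matrix[a][b] is generally not equal to
    matrix[b][a], because each is normalized by the size of the row feed.
    """
    matrix = {}
    for name_a, set_a in feeds.items():
        matrix[name_a] = {}
        for name_b, set_b in feeds.items():
            if not set_a:
                # An empty feed has no portion to contain; report 0.0.
                matrix[name_a][name_b] = 0.0
            else:
                matrix[name_a][name_b] = len(set_a & set_b) / len(set_a)
    return matrix


# Made-up sample feeds using documentation IP ranges.
feeds = {
    "feed_a": {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"},
    "feed_b": {"192.0.2.4", "198.51.100.7"},
}
matrix = overlap_matrix(feeds)
```

Here one of feed_a's four addresses appears in feed_b, while one of feed_b's two addresses appears in feed_a, so the two directions give different values. This asymmetry is why the result is a full matrix rather than a single number per pair.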

Feed Novelty Test

  • Execution time: ~2 minutes
  • Shows the ratio of IP addresses added and removed each day
  • Key insight: high-quality feeds update their content more frequently than lower-quality feeds
  • Caveat: highly dependent on feed type; a feed may contain highly relevant data that doesn't update often
  • One feed failed the test because of a URL redirect during the data-gathering stage
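
As a rough illustration of the novelty idea, the Python sketch below compares consecutive daily snapshots of a single feed. The normalization is an assumption chosen for this example (added addresses relative to the current day's size, removed addresses relative to the previous day's size); tiq-test may define its ratios differently.

```python
def novelty(snapshots):
    """Compute per-day added/removed ratios for one feed.

    snapshots: list of (day, set of indicators), ordered by day.
    Returns one dict per day after the first.
    """
    results = []
    for (_, previous), (day, current) in zip(snapshots, snapshots[1:]):
        added = current - previous      # present today, absent yesterday
        removed = previous - current    # present yesterday, absent today
        results.append({
            "day": day,
            "added_ratio": len(added) / len(current) if current else 0.0,
            "removed_ratio": len(removed) / len(previous) if previous else 0.0,
        })
    return results


# Made-up snapshots of a single feed on two consecutive days.
snapshots = [
    ("2020-01-01", {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"}),
    ("2020-01-02", {"192.0.2.4", "198.51.100.7"}),
]
report = novelty(snapshots)
```

In this sample, half of the second day's addresses are new and three quarters of the first day's addresses have been dropped.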

Feed Aging Test

  • Execution time: ~2 minutes
  • Analyzes how often each indicator recurs in a feed over the time interval
  • Aging = number of times an indicator is repeated in a feed
  • One feed failed for the same URL redirect reason as in the novelty test
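
Taking the definition above (aging as the number of times an indicator is repeated in a feed), a minimal Python sketch could count the daily snapshots in which each indicator appears. The function name `aging_counts` and the sample data are hypothetical, and this is not tiq-test's R implementation.

```python
from collections import Counter


def aging_counts(snapshots):
    """Count, per indicator, how many daily snapshots contain it.

    snapshots: list of (day, set of indicators) for a single feed.
    """
    counts = Counter()
    for _, indicators in snapshots:
        # Each snapshot is a set, so an indicator counts once per day.
        counts.update(indicators)
    return counts


# Made-up snapshots of one feed over three days.
snapshots = [
    ("2020-01-01", {"192.0.2.1", "192.0.2.2"}),
    ("2020-01-02", {"192.0.2.1", "192.0.2.3"}),
    ("2020-01-03", {"192.0.2.1", "192.0.2.3"}),
]
ages = aging_counts(snapshots)

# Histogram: how many indicators were seen on exactly N days.
age_histogram = Counter(ages.values())
```

The histogram is one way to summarize a feed: a feed dominated by low counts rotates its indicators quickly, while high counts point to long-lived entries.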

Analysis

  • Overlap, novelty and aging tests are feasible first steps toward repeatable analysis of TI feeds
  • Interpretation of results may not be as straightforward as initially expected
  • Feed uniqueness is generally high, suggesting minimal redundancy across free feeds
  • Feed update frequency correlates loosely with quality but depends heavily on feed type and purpose

Key Points

  • 16 free TI feeds evaluated with a quantitative methodology
  • Most feeds show low overlap, indicating unique value per feed
  • Novelty rate is a useful but imperfect quality proxy
  • Aging analysis reveals indicator persistence patterns
  • URL redirect issues can cause test failures in automated pipelines

References