WEEK_2 · DAY_09 · 1 HOUR

Data Onboarding & Parsing

props.conf, transforms.conf, sourcetypes, time extraction, field extraction

Splunk Lab

Learning Objectives

  • Onboard a new data source end-to-end
  • Configure props.conf for line-breaking, timestamps, sourcetypes
  • Use transforms.conf for routing, masking, index-time fields
  • Validate parsing with btool and the data preview UI

Module 1 — Sourcetypes — The First Decision

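The sourcetype tells Splunk how to parse a feed: where events break, where the timestamp is, and which fields to extract. Reuse an existing sourcetype where possible, either a pretrained one (e.g. linux_secure) or one defined by a Splunk add-on (e.g. cisco:asa, ms:o365:management).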

Custom sourcetypes go in $SPLUNK_HOME/etc/apps/<your-app>/local/props.conf.

Module 2 — props.conf — Parsing Bible

TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD — get timestamps right or every dashboard lies.
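
As a rough sketch, assume a hypothetical sourcetype acme:app:log whose events start with a timestamp such as 2024-01-12 10:15:32.123 +0000. The sourcetype name and log format are illustrative and not taken from the lab data. The three settings could then look like this:

```ini
# props.conf (hypothetical sourcetype; adjust to your own log format)
[acme:app:log]
# The timestamp is the first thing on the line.
TIME_PREFIX = ^
# Matches e.g. 2024-01-12 10:15:32.123 +0000
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N %z
# Stop scanning after the 29 characters the timestamp occupies.
MAX_TIMESTAMP_LOOKAHEAD = 29
```

A tight lookahead keeps Splunk from picking up a second date-like string later in the event.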

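LINE_BREAKER and SHOULD_LINEMERGE control event boundaries. With SHOULD_LINEMERGE = false, LINE_BREAKER alone decides where each event ends, for single-line and multi-line data alike.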
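
A minimal sketch of both cases, using two hypothetical sourcetypes (acme:app:log and acme:app:trace are made-up names, and the date pattern assumes events begin with a YYYY-MM-DD timestamp):

```ini
# props.conf (hypothetical sourcetypes)

# Single-line events: every newline ends an event.
[acme:app:log]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)

# Multi-line events (e.g. stack traces): break only where the next
# line starts with a date such as 2024-01-12.
[acme:app:trace]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2}\s)
```

The text matched by the first capture group is treated as the delimiter and discarded, so the lookahead keeps the date at the start of the next event.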

EXTRACT-* / REPORT-* — search-time field extractions.
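
The difference between the two can be sketched as follows, assuming a hypothetical acme:app:log sourcetype with key=value pairs in each event. The stanza name acme_kv_pairs and the field name user are illustrative:

```ini
# props.conf
[acme:app:log]
# EXTRACT: the regex lives inline; the named group becomes the field.
EXTRACT-user = user=(?<user>\S+)
# REPORT: points at a stanza in transforms.conf.
REPORT-kv = acme_kv_pairs

# transforms.conf
[acme_kv_pairs]
# First group is the field name, second group is its value.
REGEX = (\w+)=(\S+)
FORMAT = $1::$2
```

EXTRACT suits one-off regexes; REPORT is worth the extra stanza when the same extraction is reused across sourcetypes or needs options that only transforms.conf offers.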

Module 3 — transforms.conf — Routing & Masking

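TRANSFORMS-* in props.conf references stanzas in transforms.conf and runs at index time. SEDCMD-*, which is also set in props.conf, does simple sed-style regex masking at index time (CC numbers, secrets).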
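
A minimal masking sketch, assuming a hypothetical acme:app:log sourcetype. SEDCMD is a props.conf setting. The regex below is a rough 16-digit card pattern for illustration only and does not validate card numbers:

```ini
# props.conf (hypothetical sourcetype)
[acme:app:log]
# Replace the first 12 digits of a 16-digit card number and keep the
# last 4, e.g. 4111-1111-1111-1234 becomes XXXX-XXXX-XXXX-1234.
SEDCMD-mask_cc = s/\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?(\d{4})/XXXX-XXXX-XXXX-\1/g
```

Masking happens before the event is written to the index, so the original value cannot be recovered by searching.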

Routing: send specific events to a different index or drop them entirely.
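
Both cases can be sketched like this. The sourcetype, stanza names, regexes, and the acme_auth index are hypothetical, and the target index is assumed to already exist:

```ini
# props.conf
[acme:app:log]
# Transforms are applied in the order listed.
TRANSFORMS-routing = acme_drop_debug, acme_route_auth

# transforms.conf
# Drop DEBUG events by sending them to the null queue.
[acme_drop_debug]
REGEX = level=DEBUG
DEST_KEY = queue
FORMAT = nullQueue

# Send login/logout events to a dedicated index.
[acme_route_auth]
REGEX = action=(login|logout)
DEST_KEY = _MetaData:Index
FORMAT = acme_auth
```

Events sent to nullQueue are discarded before indexing, so test the regex against sample data first.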

Module 4 — Validation Tools

Data Preview UI (Settings → Add Data) for interactive testing.

btool: $SPLUNK_HOME/bin/splunk btool props list <sourcetype> --debug to see what config wins.

SPL Queries

Find a misconfigured sourcetype
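index=* sourcetype=*too_small earliest=-1h
| stats count by host, source, sourcetype
// A sourcetype ending in too_small (usually prefixed with the source name, e.g. access-too_small) means Splunk could not classify a small input automatically. Assign the sourcetype explicitly in inputs.conf or props.conf.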

Lab 9 — Onboard a New Source

  1. Imagine Cisco ASA firewall logs are arriving with a sourcetype ending in too_small.
  2. Write props.conf: set sourcetype, TIME_FORMAT, LINE_BREAKER.
  3. Add a SEDCMD setting to props.conf to mask credit-card numbers (SEDCMD is a props.conf setting, not a transforms.conf one).
  4. Validate with btool and a sample search.

Key Takeaways

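  • Get parsing right at onboarding: index-time settings only affect newly indexed data, so fixing mistakes later means re-indexing
  • props.conf and transforms.conf are where much of a Splunk admin's onboarding work happens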
  • btool is your debugging superpower