Data Collection During Storm Chasing 101

Eros Marcello edited this page Mar 26, 2023 · 1 revision

To effectively collect, organize, and optimize data during storm chases for our machine learning model, follow these steps:

Identify Relevant Data

Determine the data variables that are important for predicting tornadic activity and behavior. Some potential variables include temperature, humidity, wind speed, wind direction, barometric pressure, and radar-derived variables such as reflectivity, storm-relative velocity, and spectrum width.

Use Appropriate Sensors and Tools

Invest in reliable sensors and tools to collect accurate and precise data during storm chases. Some useful tools are:

  • A portable weather station for measuring temperature, humidity, wind speed, and wind direction
  • A barometer to measure atmospheric pressure
  • A mobile radar system or access to radar data from nearby National Weather Service (NWS) radar sites
  • GPS devices to record the location and time of each data point

Standardize Data Collection

Create a consistent method for data collection during storm chases. Ensure that all team members follow the same protocol for data gathering, including how frequently data is collected, how measurements are taken, and how data is stored.
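One way to make the protocol concrete is to agree on a single record layout and a set of sanity checks that every team member's logger applies. The sketch below is only an illustration: the `Observation` class, its field names, the chosen units, and the `validate` helper are assumptions for this example, not an established team standard.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """One data point. Field names and units are illustrative assumptions."""
    time_utc: datetime      # timezone-aware timestamp
    lat_deg: float          # latitude, decimal degrees
    lon_deg: float          # longitude, decimal degrees
    temperature_c: float    # air temperature, degrees Celsius
    humidity_pct: float     # relative humidity, percent
    wind_speed_ms: float    # wind speed, metres per second
    wind_dir_deg: float     # direction the wind blows from, degrees
    pressure_hpa: float     # barometric pressure, hectopascals


def validate(obs: Observation) -> list[str]:
    """Return a list of protocol violations (empty if the record passes)."""
    problems = []
    if obs.time_utc.tzinfo is None:
        problems.append("time_utc must be timezone-aware")
    if not -90.0 <= obs.lat_deg <= 90.0:
        problems.append("lat_deg out of range")
    if not -180.0 <= obs.lon_deg <= 180.0:
        problems.append("lon_deg out of range")
    if not 0.0 <= obs.humidity_pct <= 100.0:
        problems.append("humidity_pct out of range")
    if obs.wind_speed_ms < 0.0:
        problems.append("wind_speed_ms must be non-negative")
    if not 0.0 <= obs.wind_dir_deg < 360.0:
        problems.append("wind_dir_deg must be in [0, 360)")
    return problems
```

Rejecting or flagging a record at collection time is usually cheaper than discovering inconsistent units or timestamps during preprocessing.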

Organize the Data

Store the collected data in a structured format such as a CSV file or a database, with each row representing a data point and each column representing a variable (e.g., time, location, temperature, humidity). Give every column a descriptive header and use the same unit of measurement for each variable throughout the dataset.
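As a minimal sketch of such a layout, the example below writes and reads a CSV file with Python's standard `csv` module. The column names in `FIELDNAMES` (with the unit encoded as a suffix) and the helper function names are assumptions chosen for illustration.

```python
import csv
import io

# Hypothetical column set: the unit suffix in each header keeps units unambiguous.
FIELDNAMES = [
    "time_utc", "lat_deg", "lon_deg", "temperature_c",
    "humidity_pct", "wind_speed_ms", "wind_dir_deg", "pressure_hpa",
]


def write_observations(rows, fileobj):
    """Write a header line followed by one CSV row per observation dict."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        # DictWriter raises ValueError if a row contains an unknown column.
        writer.writerow(row)


def read_observations(fileobj):
    """Read the rows back as dicts keyed by header (all values are strings)."""
    return list(csv.DictReader(fileobj))


# Usage with an in-memory buffer; a real chase log would use open(path, "w", newline="").
sample = {
    "time_utc": "2023-03-26T21:00:00+00:00", "lat_deg": 35.2, "lon_deg": -97.4,
    "temperature_c": 24.0, "humidity_pct": 65.0, "wind_speed_ms": 12.0,
    "wind_dir_deg": 180.0, "pressure_hpa": 1002.5,
}
buffer = io.StringIO()
write_observations([sample], buffer)
```

The sample values are placeholders, not real measurements. Note that CSV stores everything as text, so numeric columns need to be converted back to numbers when the file is loaded.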

Clean and Preprocess the Data

After collecting the data, clean and preprocess it to prepare it for the machine learning model. This process may involve:

  • Handling missing or inconsistent data
  • Converting units of measurement, if necessary
  • Filtering out noise or outliers
  • Standardizing or normalizing the data
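The cleaning steps above can be sketched with the Python standard library alone. The function names, the median-absolute-deviation outlier rule, and its threshold of 3.5 are assumptions made for this example; other rules (and libraries such as pandas) would work just as well.

```python
import statistics


def drop_missing(values):
    """Remove readings recorded as None (e.g. a sensor dropout)."""
    return [v for v in values if v is not None]


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def filter_outliers(values, max_dev=3.5):
    """Keep values within max_dev median absolute deviations of the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)  # no spread to judge against; keep everything
    return [v for v in values if abs(v - med) / mad <= max_dev]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Placeholder readings in Fahrenheit: one dropout (None) and one obvious spike.
raw_f = [75.2, None, 77.0, 76.1, 212.0, 74.3]
celsius = [fahrenheit_to_celsius(v) for v in drop_missing(raw_f)]
cleaned = filter_outliers(celsius)
scaled = standardize(cleaned)
```

Dropping missing values is the simplest policy; interpolating between neighbouring readings may be preferable when the time series needs to stay evenly spaced.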

Feature Engineering

Extract relevant features from the collected data, create new features if needed, and select the most important features for your machine learning model. For example, you could calculate derived variables such as dew point, lapse rates, or helicity, which may be important for predicting tornadic activity.
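As one example of a derived variable, dew point can be estimated from temperature and relative humidity with the Magnus approximation. The sketch below assumes one commonly used coefficient pair (a = 17.62, b = 243.12 °C); the function name and the input validation are illustrative choices.

```python
import math

# Magnus coefficients (one commonly used pair; other published pairs differ slightly).
MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_c(temperature_c, humidity_pct):
    """Estimate dew point (deg C) from air temperature (deg C) and relative humidity (%)."""
    if not 0.0 < humidity_pct <= 100.0:
        raise ValueError("humidity_pct must be in (0, 100]")
    gamma = (math.log(humidity_pct / 100.0)
             + MAGNUS_A * temperature_c / (MAGNUS_B + temperature_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


# At 100 % relative humidity the dew point equals the air temperature.
saturated = dew_point_c(20.0, 100.0)
```

The approximation is intended for ordinary surface conditions; features such as lapse rates or helicity need vertical profile data and are not covered by this sketch.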

Integrate External Data

Consider integrating external data sources, such as Storm Prediction Center (SPC) severe weather outlooks, to complement the data collected during storm chases. This additional data can improve the model's performance and provide context for understanding tornadic activity.
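A minimal sketch of such a join is shown below. It assumes the outlook has already been reduced to one risk category per chase day for the chase area, stored in a hypothetical `outlook_by_date` lookup; real outlooks are geographic polygons and would need a spatial lookup, and this example also simplifies by matching on the UTC calendar date. The category labels follow the SPC convective outlook scale, while the numeric encoding and the `NONE` fallback are assumptions.

```python
from datetime import datetime, timezone

# Ordinal encoding of outlook categories; "NONE" is an assumed fallback label.
SPC_RISK_ORDER = {
    "NONE": 0, "TSTM": 1, "MRGL": 2, "SLGT": 3, "ENH": 4, "MDT": 5, "HIGH": 6,
}


def attach_outlook(observations, outlook_by_date):
    """Return copies of the observations with outlook category columns added."""
    merged = []
    for obs in observations:
        day = obs["time_utc"].date().isoformat()
        category = outlook_by_date.get(day, "NONE")
        merged.append({
            **obs,
            "spc_category": category,
            "spc_risk_level": SPC_RISK_ORDER[category],
        })
    return merged


# Placeholder inputs for illustration only (not real observations or outlooks).
observations = [
    {"time_utc": datetime(2023, 3, 26, 21, 0, tzinfo=timezone.utc), "temperature_c": 24.0},
    {"time_utc": datetime(2023, 3, 27, 20, 0, tzinfo=timezone.utc), "temperature_c": 19.5},
]
outlook_by_date = {"2023-03-26": "ENH"}
enriched = attach_outlook(observations, outlook_by_date)
```

Encoding the category as an ordinal number lets the model treat higher risk levels as "more" of the same quantity; a one-hot encoding is an alternative if that ordering assumption is not wanted.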

Optimize Data Collection

Continuously review and optimize your data collection process. Analyze the performance of your machine learning model and identify areas where additional data or improved data quality could lead to better predictions. Adjust your data collection methods accordingly.

By following these steps, we can collect, organize, and optimize the data gathered during storm chases and use it to build a robust machine learning model for predicting tornadic activity and behavior.