read_pandora

A small set of functions for converting a Pandora file to a dataframe.

The main function in this module is read_pandora:

from uptrop.read_pandora import read_pandora
location, pandora_df = read_pandora("pandora_file", "Tot")
pandora_df.plot('day', 'no2col')

The rest are ancillary functions for reading pandora data files.

uptrop.read_pandora.get_column_description_index(file_path)

Returns a dictionary of {description:column index} for a pandora file

See https://regex101.com/r/gAjFtL/1 for an in-depth explanation of the regex used

The regex captures two groups: group 1 is the column number and group 2 is the description.

Parameters

file_path (str) – The path to the pandora data

Returns

The dictionary of {description:column index} suitable for passing to get_column_from_description

Return type

dict{str:int}
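As an illustration of the idea, here is a minimal sketch of building such a dictionary. It is not the module's implementation: the header line shape ("Column N: description"), the pattern `COLUMN_LINE`, and the function name `sketch_column_description_index` are assumptions made for this example, and it takes a list of lines rather than a file path so that it runs on its own. Whether the real function stores the column number exactly as written in the header is not stated here.

```python
import re

# Assumed header line shape, e.g. "Column 3: Solar zenith angle ..."
# Group 1 is the column number, group 2 is the description.
COLUMN_LINE = re.compile(r"^Column\s+(\d+):\s*(.*)$")


def sketch_column_description_index(lines):
    """Map each column description to the column number found in the header."""
    index = {}
    for line in lines:
        match = COLUMN_LINE.match(line.strip())
        if match:
            index[match.group(2)] = int(match.group(1))
    return index


# Hypothetical header excerpt, used only to exercise the sketch.
header = [
    "File name: example.txt",
    "Column 1: UT date and time for center of measurement",
    "Column 2: Fractional days since 1-Jan-2000 UT midnight",
    "Column 3: Solar zenith angle for center of measurement in degree",
]
columns = sketch_column_description_index(header)
```

Lines that do not match the pattern, such as the file name line, are skipped.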

uptrop.read_pandora.get_column_from_description(column_dict, description)

Searches the output of get_column_description_index for a given description

Parameters
  • column_dict (dict) – The output from get_column_description_index

  • description (str) – A substring of the column description whose index you want to find

Returns

The first index corresponding to description

Return type

int
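The substring lookup can be sketched as follows. This is an assumption-laden illustration, not the module's code: the name `sketch_column_from_description` is hypothetical, "first" is taken to mean first in dictionary insertion order, and returning None when nothing matches is a choice made for this example only.

```python
def sketch_column_from_description(column_dict, description):
    """Return the index of the first description containing the substring."""
    for full_description, index in column_dict.items():
        if description in full_description:
            return index
    # Assumed behaviour for this sketch; the real function may differ.
    return None


# Hypothetical dictionary in the {description: column index} shape.
example_columns = {
    "UT date and time for center of measurement": 1,
    "Solar zenith angle for center of measurement in degree": 3,
    "Solar azimuth for center of measurement in degree": 4,
}
sza_index = sketch_column_from_description(example_columns, "Solar zenith")
first_solar = sketch_column_from_description(example_columns, "Solar")
missing = sketch_column_from_description(example_columns, "Ozone")
```

Because only the first match is returned, a short substring such as "Solar" resolves to the earliest matching column, so a more specific substring is the safer choice.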

uptrop.read_pandora.get_lat_lon(pandora_filepath)

Returns a dictionary of lat, lon extracted from the pandora file

Parameters

pandora_filepath (str) – The pandora file

Returns

A dict of {“lat”:lat, “lon”:lon}

Return type

dict{str:float}
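A rough sketch of pulling the position out of header lines is shown below. The header wording ("Location latitude [deg]: ..."), the two patterns, and the name `sketch_lat_lon` are assumptions for illustration; the actual file layout and the module's parsing may differ.

```python
import re

# Assumed header lines such as "Location latitude [deg]: 39.9912".
LAT_LINE = re.compile(r"latitude[^:]*:\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)
LON_LINE = re.compile(r"longitude[^:]*:\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)


def sketch_lat_lon(lines):
    """Collect the first latitude and longitude values found in the lines."""
    position = {}
    for line in lines:
        lat_match = LAT_LINE.search(line)
        lon_match = LON_LINE.search(line)
        if lat_match and "lat" not in position:
            position["lat"] = float(lat_match.group(1))
        if lon_match and "lon" not in position:
            position["lon"] = float(lon_match.group(1))
    return position


# Hypothetical header excerpt, used only to exercise the sketch.
example_header = [
    "Short location name: Example",
    "Location latitude [deg]: 39.9912",
    "Location longitude [deg]: -105.2612",
]
position = sketch_lat_lon(example_header)
```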

uptrop.read_pandora.get_start_of_data(file_path)

Gets the line number of the start of the data itself

Inspects the file line-by-line, counting lines until it finds the second dotted line

Parameters

file_path (str) – Path to the Pandora file

Returns

The 1-indexed line number of the start of the data

Return type

int
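The line-counting approach described above can be sketched like this. It assumes that a "dotted line" is a line beginning with a run of dashes and that the data starts on the line immediately after the second one; both are assumptions for this example, as is the name `sketch_start_of_data`. It works on a list of lines instead of a file path so that it is self-contained.

```python
def sketch_start_of_data(lines):
    """Return the 1-indexed line number following the second dotted line."""
    dotted_seen = 0
    for line_number, line in enumerate(lines, start=1):
        # Assumed separator shape: a line that starts with dashes.
        if line.startswith("---"):
            dotted_seen += 1
            if dotted_seen == 2:
                return line_number + 1
    raise ValueError("no second dotted line found")


# Hypothetical file contents, used only to exercise the sketch.
example_lines = [
    "File name: example.txt",
    "---------------------------",
    "Column 1: UT date and time for center of measurement",
    "---------------------------",
    "20200101T000000Z 7305.0",
]
data_start = sketch_start_of_data(example_lines)
```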

uptrop.read_pandora.read_pandora(pandora_filepath, no2col)

Reads position and data from a Pandora file

Returns two values: a dictionary of position and a dataframe of Pandora data

Pandora data can be either total column or tropospheric column

The dictionary has keys {‘lat’, ‘lon’}

The dataframe has column headings:

jday, sza, no2col, no2err, qaflag, fitflag, year, month, day, hour_utc, minute

Parameters
  • pandora_filepath (str) – The path to the pandora file

  • no2col (str) – Which NO2 column to read: total column values (‘Tot’) or tropospheric column values only (‘Trop’)

Returns

A tuple of the position dictionary and the dataframe

Return type

tuple(dict, pandas.DataFrame)