Class CruTs2pt1Supplier
- All Implemented Interfaces:
IDataSupplier
@Deprecated public class CruTs2pt1Supplier extends Object implements IDataSupplier
The class reads a set of recordHolders (files) in a source (directory) and
provides the recordHolders as an IDataset containing
IRecordHolder objects, each containing IRecord
objects (rows). Each dataset and recordHolder comes with an IMetadata
object.
IReportingListener objects may be registered with objects of
this class to receive suitable progress reporting and messaging. In general,
exceptions not dealt with internally are re-thrown for calling objects to
deal with. Messages are user-friendly.
Note that because instance variables hold a wide variety of information on previous writes, it is essential that a new instance of this class is used for each new set of files / dataset.
This parser works on climate data files produced by Dr Tim Mitchell (archived homepage) while at the Tyndall Centre for Climate Change Research and released on 23 January 2004. It is designed for CRU TS 2.1, but will also work for CRU TS 2.0 (indeed, some 2.1 data is distributed in 2.0 files).
The data comprises climate records for the global land surface, interpolated from real observations to a 0.5 degree grid as a monthly time series [1]. Data for nine observed and derived variables are available (temperature, diurnal temperature range, daily minimum and maximum temperatures, precipitation, wet-day frequency, frost-day frequency, vapour pressure, and cloud cover), but each file contains only one variable.
The data format is bespoke [2]. A description can be found via the data homepage.
NB: The data has been superseded, most recently by CRU TS 4.04 (24 April 2020: data, associated papers, and GoogleEarth visualisations).
Notes:
[2] The headers note these are "grim" files, and there is some belief online that they are "GPS Receiver Interface Module (GRIM)" files, but this is likely to be a typo for "grid" files, which is how they are described by Mitchell elsewhere. They are similar to other multi-channel raster formats and flat ASCII grid formats like ESRI's ARCINFO GRID format.
- Author:
- Andy Evans
- To Do:
- It's likely we could make a more abstract file parser at some point, set up by a properties file.
- Localise metadata notes?
- Version: 1.0 01 Mar 2021
-
Field Summary
Fields
Modifier and Type Field - Description
private BufferedReader buffer - Deprecated. Main file connection.
private int dataTokenWidth - Deprecated. Width of a data column in width-delimited data.
private boolean debug - Deprecated. Debugging flag, set by System variable passed in -Ddebug=true rather than setting here / with accessor.
private int endYear - Deprecated. As the data isn't marked up for date beyond info in the header.
private ArrayList<String> fieldNames - Deprecated. Output field names for this file type.
private ArrayList<Class> fieldTypes - Deprecated. Output field classes for this file type.
private ArrayList<IDataConsumer> listeners - Deprecated. Register for data consumers wishing to listen for pushed data.
private String metadataDatePattern - Deprecated. Format for any dates in source header.
private int numberOfHeaderLines - Deprecated. Number of lines to read for data source header.
private int progress - Deprecated. For monitoring progress at reading.
private ArrayList<String> recordHolderNames - Deprecated. Source filenames for reading.
private ArrayList<IReportingListener> reportingListeners - Deprecated. Listeners interested in updates on progress.
private File source - Deprecated. Source directory for reading.
private int startYear - Deprecated. As the data isn't marked up for date beyond info in the header.
private TabulatedDataset tabulatedDataset - Deprecated. Store for all tables in this dataset.
private int valuesPerYear - Deprecated. As the data isn't marked up for date beyond info in the header.
private int years - Deprecated. Could calculate this when needed locally, but better to do it once.
-
Constructor Summary
Constructors
Constructor - Description
CruTs2pt1Supplier() - Deprecated. Generic constructor.
-
Method Summary
Modifier and Type Method - Description
void addDataListener(IDataConsumer consumer) - Deprecated. Register for data pushes.
void addReportingListener(IReportingListener reportingListener) - Deprecated. For objects wishing to get progress reports on data reading.
void connectSource(int index) - Deprecated. Connects to a record holder (e.g. file) in the current source (directory).
void disconnectSource() - Deprecated. Disconnects from current source and any file.
private int estimateRecordCount(int index) - Deprecated. This estimates the records in the file.
private void gapFillLocalisedGUIText() - Deprecated. Sets the defaults for warnings and exceptions in English if an appropriate language properties file is missing.
IDataset getDataset() - Deprecated. Gets the dataset.
ArrayList<String> getFieldNames() - Deprecated. Gets the names of fields.
ArrayList<Class> getFieldTypes() - Deprecated. Gets the type of fields.
private ArrayList<IRecord> getParsedDataBlockAsRows(Table table) - Deprecated. Reads a data block and turns it into records.
ArrayList<String> getRecordHolderNames() - Deprecated. Gets the names of files to read.
File getSource() - Deprecated. Gets the source file.
void initialise() - Deprecated. Sets up the data supplier ready to read data.
private void initialiseFields() - Deprecated. Sets up the fields for this data type.
private void parseHeader(int index) - Deprecated. Parses the header of the data source.
void pushData() - Deprecated. Pushes data to consumers registered as data listeners.
void readData() - Deprecated. Fills the dataset with data.
private ArrayList<String> readLines(int numberOfLines) - Deprecated. Reads a set of lines and returns them as an unparsed ArrayList of Strings.
void readTable(Table table) - Deprecated. Fills a table with data.
void reportMessage(String message) - Deprecated. Reports message to reportingListeners.
void reportProgress(int progress, int total) - Deprecated. Reports progress to reportingListeners.
void reportProgress(int progress, IDataset dataset) - Deprecated. Reports progress to reportingListeners.
void setRecordHolderNames(ArrayList<String> recordHolderNames) - Deprecated. Sets the names of files to read.
void setSource(File source) - Deprecated. Connect to a File.
private void setupDataset() - Deprecated. Sets up a data structure ready for the data.
-
Field Details
-
debug
private boolean debug
Deprecated. Debugging flag, set by System variable passed in -Ddebug=true rather than setting here / with accessor.
-
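As a rough illustration of the debug flag described above, a JVM system property passed as -Ddebug=true can be read with the standard Boolean.getBoolean call. The class name DebugFlagSketch and the method isDebugEnabled are hypothetical, and this is not necessarily how the field is populated in this class.

```java
public class DebugFlagSketch {

    /** Returns true only when the "debug" system property is set to "true" (case-insensitive). */
    static boolean isDebugEnabled() {
        // Boolean.getBoolean looks up the named system property; it does not parse its argument.
        return Boolean.getBoolean("debug");
    }

    public static void main(String[] args) {
        // Run with: java -Ddebug=true DebugFlagSketch
        System.out.println("debug = " + isDebugEnabled());
    }
}
```

Reading the property at start-up keeps the switch out of the public API, which matches the note that there is no accessor for it.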
source
Deprecated. Source directory for reading.
-
recordHolderNames
Deprecated. Source filenames for reading.
-
tabulatedDataset
Deprecated. Store for all tables in this dataset.
-
fieldNames
Deprecated. Output field names for this file type.
-
fieldTypes
Deprecated. Output field classes for this file type.
-
buffer
Deprecated. Main file connection.
-
listeners
Deprecated. Register for data consumers wishing to listen for pushed data.
-
reportingListeners
Deprecated. Listeners interested in updates on progress.
-
numberOfHeaderLines
private int numberOfHeaderLines
Deprecated. Number of lines to read for data source header.
-
metadataDatePattern
Deprecated. Format for any dates in source header.
-
startYear
private int startYear
Deprecated. As the data isn't marked up for date beyond info in the header.
-
endYear
private int endYear
Deprecated. As the data isn't marked up for date beyond info in the header.
-
years
private int years
Deprecated. Could calculate this when needed locally, but better to do it once.
-
valuesPerYear
private int valuesPerYear
Deprecated. As the data isn't marked up for date beyond info in the header.
- To Do:
- Calculate this from file.
-
dataTokenWidth
private int dataTokenWidth
Deprecated. Width of a data column in width-delimited data.
-
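To make "width-delimited data" concrete, here is a minimal sketch of cutting a line into fixed-width columns. The class FixedWidthSketch, the method splitFixedWidth, and the width of 5 used in the demonstration are assumptions for illustration only; the value actually held in dataTokenWidth is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedWidthSketch {

    /** Splits a line into trimmed tokens of tokenWidth characters each. */
    static List<String> splitFixedWidth(String line, int tokenWidth) {
        if (tokenWidth <= 0) {
            throw new IllegalArgumentException("tokenWidth must be positive");
        }
        List<String> tokens = new ArrayList<>();
        for (int start = 0; start < line.length(); start += tokenWidth) {
            // The last column may be shorter than tokenWidth if the line is ragged.
            int end = Math.min(start + tokenWidth, line.length());
            tokens.add(line.substring(start, end).trim());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A width of 5 is an illustrative assumption, not a value taken from the class.
        System.out.println(splitFixedWidth("   12  345-9999", 5)); // prints [12, 345, -9999]
    }
}
```

Splitting by position rather than by whitespace matters when a value fills its whole column, as "-9999" does here, because no separator is left between neighbouring values.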
progress
private int progress
Deprecated. For monitoring progress at reading.
-
-
Constructor Details
-
CruTs2pt1Supplier
public CruTs2pt1Supplier()
Deprecated. Generic constructor.
-
-
Method Details
-
initialise
Deprecated. Sets up the data supplier ready to read data.
It creates the relevant internal data structures in preparation for reading the data, including reading in and parsing file headers.
Note that this method is kept separate from the setSource and setRecordHolderNames methods to enable piped data processing implementations in which a series of suppliers and consumers is set up prior to activation. However, setSource and setRecordHolderNames must be called before this method so that it has something to initialise.
If a source path and record holder names haven't been set using the setSource / setRecordHolderNames methods, this method throws a ParseFailedException.
- Specified by:
initialise in interface IDataSupplier
- Throws:
ParseFailedException - Usually if there is an issue reading a file; e.g. the wrong file type, the file has no data, or the source file is missing. It makes sense for callers to cancel further attempts at reading at this point.
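The call order described above can be sketched with stand-in types. Everything in this block is hypothetical: SupplierStandIn and the nested ParseFailedException only mimic the documented precondition and are not the library's classes, and the directory and file names are invented.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;

public class InitialiseOrderSketch {

    /** Hypothetical stand-in for the library's ParseFailedException. */
    static class ParseFailedException extends Exception {
        ParseFailedException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in that mimics only the documented precondition. */
    static class SupplierStandIn {
        private File source;
        private ArrayList<String> recordHolderNames;

        void setSource(File source) {
            this.source = source;
        }

        void setRecordHolderNames(ArrayList<String> recordHolderNames) {
            this.recordHolderNames = recordHolderNames;
        }

        void initialise() throws ParseFailedException {
            if (source == null || recordHolderNames == null) {
                throw new ParseFailedException("Set a source and record holder names first.");
            }
            // The real class would build its data structures and parse file headers here.
        }
    }

    /** Configures a stand-in supplier and reports whether initialise() succeeded. */
    static boolean tryInitialise(File source, ArrayList<String> recordHolderNames) {
        SupplierStandIn supplier = new SupplierStandIn();
        supplier.setSource(source);
        supplier.setRecordHolderNames(recordHolderNames);
        try {
            supplier.initialise();
            return true;
        } catch (ParseFailedException e) {
            return false; // a caller would normally cancel further reading at this point
        }
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>(Arrays.asList("example.pre"));
        System.out.println(tryInitialise(new File("data"), names)); // prints true
        System.out.println(tryInitialise(null, names));             // prints false
    }
}
```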
-
initialiseFields
private void initialiseFields()
Deprecated. Sets up the fields for this data type.
For this data type, fields, and their type in the system, are:
- Xref
- java.math.BigDecimal
- Yref
- java.math.BigDecimal
- Date
- java.util.GregorianCalendar
- Value
- java.math.BigDecimal
-
setupDataset
private void setupDataset()
Deprecated. Sets up a data structure ready for the data.
-
estimateRecordCount
private int estimateRecordCount(int index)
Deprecated. This estimates the records in the file.
The estimate is accurate for this format, but in general such estimates within this system should only be used for measuring and reporting progress on data files, not as an accurate count of records actually read.
- Parameters:
index - The position of the data to connect to in recordHolderNames.
- Returns:
- int Estimate of record count.
-
readLines
Deprecated. Reads a set of lines and returns them as an unparsed ArrayList of Strings.
Returns null only if all lines pulled by numberOfLines are null. It's therefore possible to get a smaller than expected ArrayList at the end of a file whose line count is not an exact multiple of numberOfLines. However, the next call will return null.
- Parameters:
numberOfLines - Number of lines to read.
- Returns:
- ArrayList Strings, one per line, or null at the end of the file.
- Throws:
ParseFailedException - If there is an issue.
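The end-of-file behaviour described for readLines can be sketched as follows. This is an assumption-laden illustration rather than the class's own code: ReadLinesSketch is a hypothetical name, the reader is passed in as a parameter instead of being held in the buffer field, and UncheckedIOException stands in for ParseFailedException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;

public class ReadLinesSketch {

    /** Reads up to numberOfLines lines; returns null when no lines remain. */
    static ArrayList<String> readLines(BufferedReader buffer, int numberOfLines) {
        ArrayList<String> lines = new ArrayList<>();
        try {
            for (int i = 0; i < numberOfLines; i++) {
                String line = buffer.readLine();
                if (line == null) {
                    break; // end of file reached part-way through this batch
                }
                lines.add(line);
            }
        } catch (IOException e) {
            // Stand-in for the documented ParseFailedException.
            throw new UncheckedIOException(e);
        }
        return lines.isEmpty() ? null : lines;
    }

    public static void main(String[] args) {
        BufferedReader buffer = new BufferedReader(new StringReader("a\nb\nc"));
        System.out.println(readLines(buffer, 2)); // prints [a, b]
        System.out.println(readLines(buffer, 2)); // prints [c]
        System.out.println(readLines(buffer, 2)); // prints null
    }
}
```

A caller looping on this method can therefore treat null as the only end-of-file signal and accept a short final batch as valid data.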
-
readData
Deprecated. Fills the dataset with data.
Primitives are boxed.
- Specified by:
readData in interface IDataSupplier
- Throws:
ParseFailedException - If there is an issue.
-
readTable
Deprecated. Fills a table with data.
- Parameters:
table - Table to add rows to.
- Throws:
ParseFailedException - If there is an issue.
-
parseHeader
Deprecated. Parses the header of the data source.
The data is used for internal data parsing, but is also written to the dataset metadata "notes" category.
Probably the most significant thing this method does is set the dataset metadata tag "title" to the third line of the header, which should be the data type "CRU TS 2.1" (if you read in 2.0 files or a mix, it will be whatever is read in last). This becomes the dataset name when processed. It also sets each record holder (file / table) "title" to the second line, which should be the shortened observation type (for example, ".pre = precipitation (mm)" becomes "pre"), adding the following information:
- start year
- end year
- number of the file read, starting with one
- Parameters:
index - The position of the table to connect to in recordHolderNames.
- Throws:
ParseFailedException - This exception should get passed back to the caller of initialise to end attempts at reading. Contains the message "Having difficulty reading this file. Are you sure it is CRU TS 2.x format?"
- To Do:
- Detailed reporting of poor quality header information.
- Need to get time metadata from the files.
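The shortening of the observation type mentioned above (".pre = precipitation (mm)" becoming "pre") could look something like the sketch below. HeaderTitleSketch and shortObservationType are hypothetical names, and the exact rule the parser applies is an assumption; only the input and output pair comes from the description.

```java
public class HeaderTitleSketch {

    /** Extracts the short code from a header line such as ".pre = precipitation (mm)". */
    static String shortObservationType(String headerLine) {
        // Keep only the text before the first '=' (or the whole line if there is none).
        int equalsIndex = headerLine.indexOf('=');
        String code = (equalsIndex >= 0 ? headerLine.substring(0, equalsIndex) : headerLine).trim();
        // Drop the leading dot of the file-extension style code.
        return code.startsWith(".") ? code.substring(1) : code;
    }

    public static void main(String[] args) {
        System.out.println(shortObservationType(".pre = precipitation (mm)")); // prints pre
    }
}
```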
-
getParsedDataBlockAsRows
Deprecated. Reads a data block and turns it into records.
In this file format a data block is an Xref/Yref header plus a set of rows representing years. Values across a row are monthly. We therefore read a block at a time rather than a row at a time.
Reports progress to any ReportingListeners.
- Parameters:
table - This is used to connect rows with parent tables.
- Returns:
- ArrayList An ArrayList of rows, each row containing data in the appropriate field order.
- Throws:
ParseFailedException - If there's an issue.
- See Also:
getFieldNames(), getFieldTypes()
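A minimal sketch of expanding one such block into rows, using the Xref, Yref, Date, Value field order listed under initialiseFields. It assumes the header and values have already been parsed into integers; DataBlockSketch, blockToRows, the Object[] row representation, and dating each value to the first day of its month are all illustrative assumptions rather than the class's actual behaviour.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.GregorianCalendar;
import java.util.List;

public class DataBlockSketch {

    /** Expands one block (one grid cell, one row per year, one value per month) into rows. */
    static List<Object[]> blockToRows(int xref, int yref, int startYear, int[][] valuesByYear) {
        List<Object[]> rows = new ArrayList<>();
        for (int y = 0; y < valuesByYear.length; y++) {
            for (int m = 0; m < valuesByYear[y].length; m++) {
                // GregorianCalendar months are zero-based, so m == 0 is January.
                // Using day 1 of the month is an assumption made for this sketch.
                GregorianCalendar date = new GregorianCalendar(startYear + y, m, 1);
                rows.add(new Object[] {
                    BigDecimal.valueOf(xref),               // Xref
                    BigDecimal.valueOf(yref),               // Yref
                    date,                                   // Date
                    BigDecimal.valueOf(valuesByYear[y][m])  // Value
                });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        int[][] values = new int[2][12]; // two years of monthly values, all zero
        System.out.println(blockToRows(1, 148, 1901, values).size()); // prints 24
    }
}
```

Reading a whole block at once means the grid reference only needs to be parsed once and can then be attached to every monthly value in that block.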
-
pushData
Deprecated. Pushes data to consumers registered as data listeners.
The method reads a data block at a time and pushes it to registered data consumers for processing by calling their load(ArrayList<IRecord> records) method when reading has completed.
Garbage collects at the end of each push.
- Specified by:
pushData in interface IDataSupplier
- Throws:
ParseFailedException - If there is an issue.
- See Also:
addDataListener(IDataConsumer consumer)
-
setSource
Deprecated. Connect to a File.
Reading begun under initialisation.
- Specified by:
setSource in interface IDataSupplier
- Parameters:
source - Source file to read.
- Throws:
ParseFailedException - Not used in this implementation.
-
getSource
Deprecated. Gets the source file.
- Specified by:
getSource in interface IDataSupplier
- Returns:
- File The source file.
-
setRecordHolderNames
Deprecated. Sets the names of files to read.
- Specified by:
setRecordHolderNames in interface IDataSupplier
- Parameters:
recordHolderNames - ArrayList of names.
- Throws:
ParseFailedException - Not used in this implementation.
- See Also:
IDataSupplier.initialise()
-
getRecordHolderNames
Deprecated. Gets the names of files to read.
- Specified by:
getRecordHolderNames in interface IDataSupplier
- Returns:
- recordHolderNames ArrayList of names.
-
getFieldNames
Deprecated. Gets the names of fields.
- Returns:
- ArrayList ArrayList of names.
-
getFieldTypes
Deprecated. Gets the type of fields.
Primitives are boxed.
- Returns:
- ArrayList ArrayList of Classes.
-
getDataset
Deprecated. Gets the dataset.
Note that the dataset will not be created and filled with fields and metadata until initialise is called. It will not be filled with data until readData is called.
- Specified by:
getDataset in interface IDataSupplier
- Returns:
- IDataset The dataset.
- See Also:
IDataSupplier.pushData()
-
addDataListener
Deprecated. Register for data pushes.
- Specified by:
addDataListener in interface IDataSupplier
- Parameters:
consumer - Data consumer.
- See Also:
IDataConsumer.load(ArrayList<IRecord> records), IDataSupplier.pushData()
-
addReportingListener
Deprecated. For objects wishing to get progress reports on data reading.
- Specified by:
addReportingListener in interface IDataSupplier
- Parameters:
reportingListener - Object wishing to gain reports.
- See Also:
IReportingListener
-
connectSource
Deprecated. Connects to a record holder (e.g. file) in the current source (directory).
- Specified by:
connectSource in interface IDataSupplier
- Parameters:
index - Index of record holder to connect to in collection set using setRecordHolderNames.
- Throws:
ParseFailedException - Only if there is an issue.
-
disconnectSource
Deprecated. Disconnects from current source and any file.
Forces a garbage collection.
- Specified by:
disconnectSource in interface IDataSupplier
- Throws:
ParseFailedException - Not thrown in this implementation.
-
gapFillLocalisedGUIText
private void gapFillLocalisedGUIText()
Deprecated. Sets the defaults for warnings and exceptions in English if an appropriate language properties file is missing.
-
reportProgress
Deprecated. Reports progress to reportingListeners.
Reports if progress is a multiple of total records / 100. If progress is zero or less, reports progress as 0 of 1.
- Parameters:
progress - Progress in record processing.
dataset - Dataset to extract estimate of processing to be done.
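The reporting rule above can be sketched as a small decision function. ProgressSketch and progressToReport are hypothetical names, the total is passed in directly rather than extracted from an IDataset, and the guard for totals under 100 is an assumption added so that the sketch never divides by zero; how the real method handles that case is not documented here.

```java
public class ProgressSketch {

    /** Returns the {progress, total} pair to report, or null when no report is due. */
    static int[] progressToReport(int progress, int total) {
        if (progress <= 0) {
            return new int[] {0, 1}; // documented fallback: report "0 of 1"
        }
        // Assumption: avoid a zero step (and a division by zero) when total < 100.
        int step = Math.max(1, total / 100);
        return (progress % step == 0) ? new int[] {progress, total} : null;
    }

    public static void main(String[] args) {
        System.out.println(progressToReport(50, 1000) != null); // prints true
        System.out.println(progressToReport(55, 1000) != null); // prints false
    }
}
```

Reporting only at multiples of one hundredth of the total keeps listener calls to roughly a hundred per file, however many records are read.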
-
reportProgress
public void reportProgress(int progress, int total)
Deprecated. Reports progress to reportingListeners.
Reports for an arbitrary progress and total worked towards.
- Parameters:
progress - Value indicating progress through work total.
total - Value indicating total work to do.
-
reportMessage
Deprecated. Reports message to reportingListeners.
- Parameters:
message - Message to reporting listeners.
-