Class CruTs2pt1Supplier
- All Implemented Interfaces:
IDataSupplier
@Deprecated public class CruTs2pt1Supplier extends Object implements IDataSupplier
The class reads a set of recordHolders (files) in a source (directory) and
provides the recordHolders as an IDataset
containing
IRecordHolder
objects, each containing IRecord
objects (rows). Each dataset and recordHolder comes with an IMetadata
object.
IReportingListener
objects may be registered with objects of
this class to receive suitable progress reporting and messaging. In general,
exceptions not dealt with internally are re-thrown for calling objects to
deal with. Messages are user-friendly.
Note that because instance variables hold a wide variety of information on previous writes, it is essential to use a new instance of this class for each new set of files / dataset.
This parser works on climate data files produced by Dr Tim Mitchell (archived homepage) while at the Tyndall Centre for Climate Change Research and released 23rd January 2004. It is designed for CRU TS 2.1, but will also work for CRU TS 2.0 (indeed, some 2.1 data is distributed in 2.0 files).
The data comprises climate records for the global land surface, interpolated from real observations to a 0.5 degree grid as a monthly time series [1]. Data for nine observed and derived variables are available (temperature, diurnal temperature range, daily minimum and maximum temperatures, precipitation, wet-day frequency, frost-day frequency, vapour pressure, and cloud cover), but each file contains only one.
The data format is bespoke [2]. A description can be found via the data homepage.
NB: The data has been superseded, most recently by CRU TS 4.04 (24 April 2020: data, associated papers, and GoogleEarth visualisations).
Notes:
[2] The headers note these are "grim" files, and there is some belief online that this stands for "GPS Receiver Interface Module (GRIM)", but it is likely to be a typo for "grid" files, which is how Mitchell describes them elsewhere. They are similar to other multi-channel raster formats and flat ASCII grid formats like ESRI's ARCINFO GRID format.
- Author:
- Andy Evans
- To Do:
- It's likely we could make a more abstract file parser at some point, set up by a properties file. Localise metadata notes?
- Version: 1.0 01 Mar 2021
-
Field Summary
Fields (Modifier and Type, Field: Description)
- private BufferedReader buffer: Deprecated. Main file connection.
- private int dataTokenWidth: Deprecated. Width of a data column in width-delimited data.
- private boolean debug: Deprecated. Debugging flag, set by System variable passed in, -Ddebug=true, rather than setting here / with accessor.
- private int endYear: Deprecated. As the data isn't marked up for date beyond info in the header.
- private ArrayList<String> fieldNames: Deprecated. Output field names for this file type.
- private ArrayList<Class> fieldTypes: Deprecated. Output field classes for this file type.
- private ArrayList<IDataConsumer> listeners: Deprecated. Register for data consumers wishing to listen for pushed data.
- private String metadataDatePattern: Deprecated. Format for any dates in source header.
- private int numberOfHeaderLines: Deprecated. Number of lines to read for data source header.
- private int progress: Deprecated. For monitoring progress at reading.
- private ArrayList<String> recordHolderNames: Deprecated. Source filenames for reading.
- private ArrayList<IReportingListener> reportingListeners: Deprecated. Listeners interested in updates on progress.
- private File source: Deprecated. Source directory for reading.
- private int startYear: Deprecated. As the data isn't marked up for date beyond info in the header.
- private TabulatedDataset tabulatedDataset: Deprecated. Store for all tables in this dataset.
- private int valuesPerYear: Deprecated. As the data isn't marked up for date beyond info in the header.
- private int years: Deprecated. Could calculate this when needed locally, but better to do it once.
-
Constructor Summary
Constructors (Constructor: Description)
- CruTs2pt1Supplier(): Deprecated. Generic constructor.
-
Method Summary
Methods (Modifier and Type, Method: Description)
- void addDataListener(IDataConsumer consumer): Deprecated. Register for data pushes.
- void addReportingListener(IReportingListener reportingListener): Deprecated. For objects wishing to get progress reports on data reading.
- void connectSource(int index): Deprecated. Connects to a record holder (e.g. file) in the current source (directory).
- void disconnectSource(): Deprecated. Disconnects from current source and any file.
- private int estimateRecordCount(int index): Deprecated. This estimates the records in the file.
- private void gapFillLocalisedGUIText(): Deprecated. Sets the defaults for warnings and exceptions in English if an appropriate language properties file is missing.
- IDataset getDataset(): Deprecated. Gets the dataset.
- ArrayList<String> getFieldNames(): Deprecated. Gets the names of fields.
- ArrayList<Class> getFieldTypes(): Deprecated. Gets the type of fields.
- private ArrayList<IRecord> getParsedDataBlockAsRows(Table table): Deprecated. Reads a data block and turns it into records.
- ArrayList<String> getRecordHolderNames(): Deprecated. Gets the names of files to read.
- File getSource(): Deprecated. Gets the source file.
- void initialise(): Deprecated. Sets up the data supplier ready to read data.
- private void initialiseFields(): Deprecated. Sets up the fields for this data type.
- private void parseHeader(int index): Deprecated. Parses the header of the data source.
- void pushData(): Deprecated. Pushes data to consumers registered as data listeners.
- void readData(): Deprecated. Fills the dataset with data.
- private ArrayList<String> readLines(int numberOfLines): Deprecated. Reads a set of lines and returns them as an unparsed ArrayList of Strings.
- void readTable(Table table): Deprecated. Fills a table with data.
- void reportMessage(String message): Deprecated. Reports message to reportingListeners.
- void reportProgress(int progress, int total): Deprecated. Reports progress to reportingListeners.
- void reportProgress(int progress, IDataset dataset): Deprecated. Reports progress to reportingListeners.
- void setRecordHolderNames(ArrayList<String> recordHolderNames): Deprecated. Sets the names of files to read.
- void setSource(File source): Deprecated. Connect to a File.
- private void setupDataset(): Deprecated. Sets up a data structure ready for the data.
-
Field Details
-
debug
private boolean debug
Deprecated. Debugging flag, set by System variable passed in, -Ddebug=true, rather than setting here / with accessor.
-
source
private File source
Deprecated. Source directory for reading.
-
recordHolderNames
private ArrayList<String> recordHolderNames
Deprecated. Source filenames for reading.
-
tabulatedDataset
private TabulatedDataset tabulatedDataset
Deprecated. Store for all tables in this dataset.
-
fieldNames
private ArrayList<String> fieldNames
Deprecated. Output field names for this file type.
-
fieldTypes
private ArrayList<Class> fieldTypes
Deprecated. Output field classes for this file type.
-
buffer
private BufferedReader buffer
Deprecated. Main file connection.
-
listeners
private ArrayList<IDataConsumer> listeners
Deprecated. Register for data consumers wishing to listen for pushed data.
-
reportingListeners
private ArrayList<IReportingListener> reportingListeners
Deprecated. Listeners interested in updates on progress.
-
numberOfHeaderLines
private int numberOfHeaderLines
Deprecated. Number of lines to read for data source header.
-
metadataDatePattern
private String metadataDatePattern
Deprecated. Format for any dates in source header.
-
startYear
private int startYear
Deprecated. As the data isn't marked up for date beyond info in the header.
-
endYear
private int endYear
Deprecated. As the data isn't marked up for date beyond info in the header.
-
years
private int years
Deprecated. Could calculate this when needed locally, but better to do it once.
-
valuesPerYear
private int valuesPerYear
Deprecated. As the data isn't marked up for date beyond info in the header.
- To Do:
- Calculate this from file.
-
dataTokenWidth
private int dataTokenWidth
Deprecated. Width of a data column in width-delimited data.
-
progress
private int progress
Deprecated. For monitoring progress at reading.
-
-
Constructor Details
-
CruTs2pt1Supplier
public CruTs2pt1Supplier()
Deprecated. Generic constructor.
-
-
Method Details
-
initialise
Deprecated. Sets up the data supplier ready to read data.
It creates the relevant internal data structures in preparation for reading the data, including reading in and parsing file headers.
Note that this method is kept separate from the setSource and setRecordHolderNames methods to enable piped data processing implementations where a series of suppliers and consumers are set up prior to activation. However, setSource and setRecordHolderNames must be called before this method so that it has something to initialise.
If a source path and record holder names haven't been set using the setSource / setRecordHolderNames methods, this method throws a ParseFailedException.
- Specified by:
- initialise in interface IDataSupplier
- Throws:
- ParseFailedException - Usually if there is an issue reading a file; e.g. the wrong file type, the file has no data, or the source file is missing. It makes sense for callers to cancel further attempts at reading at this point.
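The call order described above can be illustrated with a minimal, self-contained sketch. It does not use the real class: InitialiseOrderSketch and ParseFailedSketchException are hypothetical stand-ins (the latter unchecked for brevity), and the directory and file names are invented. It only shows the precondition that a source and record holder names must be set before initialise is called.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class InitialiseOrderSketch {

    /** Hypothetical stand-in for ParseFailedException (unchecked here for brevity). */
    static final class ParseFailedSketchException extends RuntimeException {
        ParseFailedSketchException(String message) {
            super(message);
        }
    }

    private File source;
    private ArrayList<String> recordHolderNames;
    private boolean initialised;

    void setSource(File source) {
        this.source = source;
    }

    void setRecordHolderNames(ArrayList<String> recordHolderNames) {
        this.recordHolderNames = recordHolderNames;
    }

    boolean isInitialised() {
        return initialised;
    }

    /** Fails fast when there is nothing to initialise. */
    void initialise() {
        if (source == null || recordHolderNames == null || recordHolderNames.isEmpty()) {
            throw new ParseFailedSketchException(
                    "Set a source and record holder names before initialising.");
        }
        // The real class would read and parse the file headers at this point.
        initialised = true;
    }

    public static void main(String[] args) {
        InitialiseOrderSketch supplier = new InitialiseOrderSketch();
        try {
            supplier.initialise(); // Called too early: nothing has been set yet.
        } catch (ParseFailedSketchException e) {
            System.out.println("Cancelled: " + e.getMessage());
        }
        supplier.setSource(new File("data"));                                   // invented directory
        supplier.setRecordHolderNames(new ArrayList<>(List.of("example.pre"))); // invented file name
        supplier.initialise();
        System.out.println("Initialised: " + supplier.isInitialised());
    }
}
```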
-
initialiseFields
private void initialiseFields()
Deprecated. Sets up the fields for this data type.
For this data type, the fields and their types in the system are:
- Xref: java.math.BigDecimal
- Yref: java.math.BigDecimal
- Date: java.util.GregorianCalendar
- Value: java.math.BigDecimal
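As an illustration, the field list above could be held as two parallel lists, one of names and one of classes, in the shape returned by getFieldNames and getFieldTypes. This is a sketch only: FieldSetupSketch is a hypothetical class, and Class<?> is used in place of the raw Class type shown in the signatures.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.GregorianCalendar;

final class FieldSetupSketch {

    /** Field names in output order. */
    static ArrayList<String> fieldNames() {
        ArrayList<String> names = new ArrayList<>();
        names.add("Xref");
        names.add("Yref");
        names.add("Date");
        names.add("Value");
        return names;
    }

    /** Field classes, index-aligned with fieldNames(). */
    static ArrayList<Class<?>> fieldTypes() {
        ArrayList<Class<?>> types = new ArrayList<>();
        types.add(BigDecimal.class);
        types.add(BigDecimal.class);
        types.add(GregorianCalendar.class);
        types.add(BigDecimal.class);
        return types;
    }

    public static void main(String[] args) {
        ArrayList<String> names = fieldNames();
        ArrayList<Class<?>> types = fieldTypes();
        for (int i = 0; i < names.size(); i++) {
            System.out.println(names.get(i) + ": " + types.get(i).getName());
        }
    }
}
```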
-
setupDataset
private void setupDataset()
Deprecated. Sets up a data structure ready for the data.
-
estimateRecordCount
private int estimateRecordCount(int index)
Deprecated. This estimates the records in the file.
It is an accurate estimate here, but generally within this system the estimate should only be used for progress measurement and reporting on data files, not as an accurate count of records actually read.
- Parameters:
- index - The position of the data to connect to in recordHolderNames.
- Returns:
- int Estimate of record count.
-
readLines
Deprecated. Reads a set of lines and returns them as an unparsed ArrayList of Strings.
Returns null only if all lines pulled by numberOfLines are null. It's therefore possible to get a smaller than expected ArrayList at the end of a file whose line count is not an exact multiple of numberOfLines (that is, not "% numberOfLines == 0"). However, the next call will return null.
- Parameters:
- numberOfLines - Number of lines to read.
- Returns:
- ArrayList Strings, one per line, or null at the end of the file.
- Throws:
- ParseFailedException - If there is an issue.
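The return behaviour described above (a short final chunk, then null) can be sketched as follows. This is not the class's own code: ReadLinesSketch is a hypothetical class, the reader is passed in rather than held in the buffer field, and UncheckedIOException stands in for ParseFailedException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;

final class ReadLinesSketch {

    /**
     * Reads up to numberOfLines lines from the reader.
     * Returns null only when no line at all could be read.
     */
    static ArrayList<String> readLines(BufferedReader reader, int numberOfLines) {
        ArrayList<String> lines = new ArrayList<>();
        try {
            for (int i = 0; i < numberOfLines; i++) {
                String line = reader.readLine();
                if (line == null) {
                    break; // End of input reached part-way through the chunk.
                }
                lines.add(line);
            }
        } catch (IOException e) {
            // Stand-in for the ParseFailedException thrown by the real method.
            throw new UncheckedIOException(e);
        }
        return lines.isEmpty() ? null : lines;
    }

    public static void main(String[] args) {
        // Five lines read two at a time: two full chunks, one short chunk, then null.
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc\nd\ne"));
        ArrayList<String> chunk;
        while ((chunk = readLines(reader, 2)) != null) {
            System.out.println(chunk);
        }
    }
}
```

Callers can therefore loop until null is returned, but should not assume every chunk has exactly numberOfLines entries.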
-
readData
Deprecated. Fills the dataset with data.
Primitives are boxed.
- Specified by:
- readData in interface IDataSupplier
- Throws:
- ParseFailedException - If there is an issue.
-
readTable
Deprecated. Fills a table with data.
- Parameters:
- table - Table to add rows to.
- Throws:
- ParseFailedException - If there is an issue.
-
parseHeader
Deprecated. Parses the header of the data source.
The data is used for internal data parsing, but is also written to the dataset metadata "notes" category.
Probably the most significant thing this method does is set the dataset metadata tag "title" to the third line of the header, which should be the data type "CRU TS 2.1" (if you read in 2.0 files or a mix, it will be whatever is read in last). This becomes the dataset name when processed. It also sets each record holder (file / table) "title" to the second line, which should be the shortened observation type (for example, ".pre = precipitation (mm)" becomes "pre"), adding the following information:
- start year
- end year
- number of the file read, starting with one
- Parameters:
- index - The position of the table to connect to in recordHolderNames.
- Throws:
- ParseFailedException - This exception should get passed back to the caller of initialise to end attempts at reading. Contains the message "Having difficulty reading this file. Are you sure it is CRU TS 2.x format?"
- To Do:
- Detailed reporting of poor quality header information. Need to get time metadata from the files.
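The title shortening mentioned above (".pre = precipitation (mm)" becomes "pre") might look something like the sketch below. How the real method does this is not shown on this page; HeaderTitleSketch and shortTitle are hypothetical names, and the rule used (take the text before "=", trim it, drop a leading dot) is an assumption based on the single example given.

```java
final class HeaderTitleSketch {

    /**
     * Shortens a header line such as ".pre = precipitation (mm)" to "pre".
     * Assumed rule: keep the text before '=', trim it, and drop a leading dot.
     */
    static String shortTitle(String headerLine) {
        int equalsAt = headerLine.indexOf('=');
        String code = (equalsAt >= 0 ? headerLine.substring(0, equalsAt) : headerLine).trim();
        return code.startsWith(".") ? code.substring(1) : code;
    }

    public static void main(String[] args) {
        System.out.println(shortTitle(".pre = precipitation (mm)"));
    }
}
```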
-
getParsedDataBlockAsRows
Deprecated. Reads a data block and turns it into records.
In this file format a data block is a Xref/Yref header plus a set of rows representing years. Values across a row are monthly. We therefore read a block at a time rather than a row at a time.
Reports progress to any ReportingListeners.
- Parameters:
- table - This is used to connect rows with parent tables.
- Returns:
- ArrayList An ArrayList of rows, each row containing data in the appropriate field order.
- Throws:
- ParseFailedException - If there's an issue.
- See Also:
- getFieldNames(), getFieldTypes()
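A rough sketch of block parsing as described above: one Xref/Yref header line, then one line per year holding monthly values in fixed-width columns. Several details are assumptions not confirmed on this page: the header layout ("Grid-ref=   1, 148"), a token width of 5 characters, dates set to the first day of each month, and rows represented as Object arrays in the order Xref, Yref, Date, Value rather than as IRecord objects. DataBlockSketch is a hypothetical class.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.GregorianCalendar;
import java.util.List;

final class DataBlockSketch {

    /**
     * Turns one block (header line plus one line per year) into rows of
     * {Xref, Yref, Date, Value}. The first data line is taken to be startYear.
     */
    static ArrayList<Object[]> parseBlock(List<String> blockLines, int startYear, int tokenWidth) {
        // Header: keep what follows '=' and split the two references on the comma.
        String header = blockLines.get(0);
        String[] refs = header.substring(header.indexOf('=') + 1).split(",");
        BigDecimal xref = new BigDecimal(refs[0].trim());
        BigDecimal yref = new BigDecimal(refs[1].trim());

        ArrayList<Object[]> rows = new ArrayList<>();
        for (int i = 1; i < blockLines.size(); i++) {
            String line = blockLines.get(i);
            int year = startYear + i - 1;
            // Width-delimited data: cut the line into fixed-width tokens, one per month.
            for (int month = 0; (month + 1) * tokenWidth <= line.length(); month++) {
                String token = line.substring(month * tokenWidth, (month + 1) * tokenWidth).trim();
                // GregorianCalendar months are zero-based, so 0 is January.
                rows.add(new Object[] {
                    xref, yref, new GregorianCalendar(year, month, 1), new BigDecimal(token)});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> block = List.of(
                "Grid-ref=   1, 148",
                "  100  110  120  130  140  150  160  170  180  190  200  210",
                "  101  111  121  131  141  151  161  171  181  191  201  211");
        ArrayList<Object[]> rows = parseBlock(block, 1901, 5);
        System.out.println(rows.size() + " rows parsed");
    }
}
```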
-
pushData
Deprecated. Pushes data to consumers registered as data listeners.
The method reads a data block at a time and pushes it to registered data consumers for processing by calling their load(ArrayList<IRecord> records) method when reading is completed.
Garbage collects at the end of each push.
- Specified by:
- pushData in interface IDataSupplier
- Throws:
- ParseFailedException - If there is an issue.
- See Also:
- addDataListener(IDataConsumer consumer)
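The push pattern described above can be sketched with hypothetical stand-ins. PushSketch and DataConsumerSketch are invented for illustration, records are plain Strings rather than IRecord objects, and the blocks are supplied in memory, whereas the real class reads each block from file.

```java
import java.util.ArrayList;
import java.util.List;

final class PushSketch {

    /** Hypothetical stand-in for IDataConsumer. */
    interface DataConsumerSketch {
        void load(ArrayList<String> records);
    }

    private final List<DataConsumerSketch> listeners = new ArrayList<>();

    void addDataListener(DataConsumerSketch consumer) {
        listeners.add(consumer);
    }

    /** Hands each block to every registered consumer, one block at a time. */
    void pushData(List<ArrayList<String>> blocks) {
        for (ArrayList<String> block : blocks) {
            for (DataConsumerSketch listener : listeners) {
                listener.load(block);
            }
        }
    }

    public static void main(String[] args) {
        PushSketch supplier = new PushSketch();
        supplier.addDataListener(records -> System.out.println("Loaded " + records.size() + " records"));
        List<ArrayList<String>> blocks = new ArrayList<>();
        blocks.add(new ArrayList<>(List.of("row 1", "row 2")));
        blocks.add(new ArrayList<>(List.of("row 3")));
        supplier.pushData(blocks);
    }
}
```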
-
setSource
Deprecated. Connect to a File.
Reading begins at initialisation.
- Specified by:
- setSource in interface IDataSupplier
- Parameters:
- source - Source file to read.
- Throws:
- ParseFailedException - Not used in this implementation.
-
getSource
Deprecated. Gets the source file.
- Specified by:
- getSource in interface IDataSupplier
- Returns:
- File The source file.
-
setRecordHolderNames
Deprecated. Sets the names of files to read.
- Specified by:
- setRecordHolderNames in interface IDataSupplier
- Parameters:
- recordHolderNames - ArrayList of names.
- Throws:
- ParseFailedException - Not used in this implementation.
- See Also:
- IDataSupplier.initialise()
-
getRecordHolderNames
Deprecated. Gets the names of files to read.
- Specified by:
- getRecordHolderNames in interface IDataSupplier
- Returns:
- recordHolderNames ArrayList of names.
-
getFieldNames
Deprecated. Gets the names of fields.
- Returns:
- ArrayList ArrayList of names.
-
getFieldTypes
Deprecated. Gets the type of fields.
Primitives are boxed.
- Returns:
- ArrayList ArrayList of Classes.
-
getDataset
Deprecated. Gets the dataset.
Note that the dataset will not be instantiated and filled with fields and metadata until initialise is called. It will not be filled with data until readData is called.
- Specified by:
- getDataset in interface IDataSupplier
- Returns:
- IDataset The dataset.
- See Also:
- IDataSupplier.pushData()
-
addDataListener
Deprecated. Register for data pushes.
- Specified by:
- addDataListener in interface IDataSupplier
- Parameters:
- consumer - Data consumer.
- See Also:
- IDataConsumer.load(ArrayList<IRecord> records), IDataSupplier.pushData()
-
addReportingListener
Deprecated. For objects wishing to get progress reports on data reading.
- Specified by:
- addReportingListener in interface IDataSupplier
- Parameters:
- reportingListener - Object wishing to gain reports.
- See Also:
- IReportingListener
-
connectSource
Deprecated. Connects to a record holder (e.g. file) in the current source (directory).
- Specified by:
- connectSource in interface IDataSupplier
- Parameters:
- index - Index of record holder to connect to in collection set using setRecordHolderNames.
- Throws:
- ParseFailedException - Only if there is an issue.
-
disconnectSource
Deprecated. Disconnects from current source and any file.
Forces a garbage collection.
- Specified by:
- disconnectSource in interface IDataSupplier
- Throws:
- ParseFailedException - Not thrown in this implementation.
-
gapFillLocalisedGUIText
private void gapFillLocalisedGUIText()
Deprecated. Sets the defaults for warnings and exceptions in English if an appropriate language properties file is missing.
-
reportProgress
Deprecated. Reports progress to reportingListeners.
Reports if progress is a multiple of total records / 100. If progress is zero or less, reports progress as 0 of 1.
- Parameters:
- progress - Progress in record processing.
- dataset - Dataset to extract estimate of processing to be done.
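The throttling rule described above can be sketched as below. ProgressSketch and ProgressListenerSketch are hypothetical stand-ins (the methods of IReportingListener are not shown on this page), the total is passed in directly instead of being extracted from a dataset, and the guard for totals under 100 is an addition of this sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

final class ProgressSketch {

    /** Hypothetical stand-in for IReportingListener. */
    interface ProgressListenerSketch {
        void progressReported(int progress, int total);
    }

    private final List<ProgressListenerSketch> listeners = new ArrayList<>();

    void addListener(ProgressListenerSketch listener) {
        listeners.add(listener);
    }

    /** Reports roughly once per percent of the estimated total. */
    void reportProgress(int progress, int estimatedTotal) {
        if (progress <= 0) {
            notifyListeners(0, 1); // Nothing done yet: report 0 of 1.
            return;
        }
        // Guard (sketch only): avoid a zero step when the total is under 100.
        int step = Math.max(1, estimatedTotal / 100);
        if (progress % step == 0) {
            notifyListeners(progress, estimatedTotal);
        }
    }

    private void notifyListeners(int progress, int total) {
        for (ProgressListenerSketch listener : listeners) {
            listener.progressReported(progress, total);
        }
    }

    public static void main(String[] args) {
        ProgressSketch reporter = new ProgressSketch();
        reporter.addListener((done, total) -> System.out.println(done + " of " + total));
        for (int i = 0; i <= 300; i += 50) {
            reporter.reportProgress(i, 1000);
        }
    }
}
```

Throttling in this way keeps listeners such as progress bars from being updated once per record on large files.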
-
reportProgress
public void reportProgress(int progress, int total)
Deprecated. Reports progress to reportingListeners.
Reports for an arbitrary progress and total worked towards.
- Parameters:
- progress - Value indicating progress through work total.
- total - Value indicating total work to do.
-
reportMessage
Deprecated. Reports message to reportingListeners.
- Parameters:
- message - Message to reporting listeners.
-