D = dbload(FName, ...)
D = dbload(D,FName, ...)
FName [ char | cellstr ] - Name of the input CSV data file, or a cell array of CSV file names that will be combined.
D [ struct ] - An existing database (struct) to which the new entries from the input CSV data file will be added.
D [ struct ] - Database created from the input CSV file(s).
'case=' [ 'lower' | 'upper' | empty ] - Change case of variable names.
'commentRow=' [ char | cellstr | {'comment','comments'} ] - Label at the start of the row that will be used to create tseries object comments.
'dateFormat=' [ char | 'YYYYFP' ] - Format of dates in the first column.
'delimiter=' [ char | ',' ] - Delimiter separating the individual values (cells) in the CSV file; if different from a comma, all occurrences of the delimiter will be replaced with commas; note that this will also affect text in comments.
'firstDateOnly=' [ true | false ] - Read and parse only the first date string, and fill in the remaining dates assuming a range of consecutive dates.
'freq=' [ 0 | 1 | 2 | 4 | 6 | 12 | 365 | 'daily' | empty ] - Advise the frequency of dates; if empty, the frequency will be recognised automatically.
'freqLetters=' [ char | 'YHQBM' ] - Letters representing the frequency of dates in the date column.
'inputFormat=' [ 'auto' | 'csv' | 'xls' ] - Format of the input data file; 'auto' means the format will be determined by the file extension.
'nameRow=' [ char | numeric | {'','Variables'} ] - String, or cell array of possible strings, found at the beginning (in the first cell) of the row with variable names; or the line number at which the row with variable names appears (the first row is numbered 1).
'nameFunc=' [ cell | function_handle | empty ] - Function used to change or transform the variable names. If a cell array of function handles is given, the functions will be applied in the given order.
'nan=' [ char | NaN ] - String representing missing observations (case insensitive).
'preProcess=' [ function_handle | cell | empty ] - Apply this function, or cell array of functions, to the raw text file before parsing the data.
'select=' [ char | cellstr | empty ] - Only database entries included on this list will be read in and returned in the output database D; entries not on this list will be discarded.
'skipRows=' [ char | cellstr | numeric | empty ] - Skip rows whose first cell matches the string or strings (regular expressions); or, skip a vector of row numbers.
'userData=' [ char | Inf ] - Field name under which the database userdata loaded from the CSV file (if they exist) will be stored in the output database; Inf means the field name will be read from the CSV file (and will thus be identical to that of the originally saved database).
'userDataField=' [ char | '.' ] - A leading character denoting userdata fields for individual time series; if empty, no userdata fields will be read in and created.
'userDataFieldList=' [ cellstr | numeric | empty ] - List of row headers, or vector of row numbers, that will be included as user data in each time series.
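The two calling forms and the 'select=' option can be combined as in the following minimal sketch. It assumes the IRIS Toolbox is on the MATLAB path; the file names first.csv and second.csv and their contents are hypothetical, and the option name is passed without the trailing equals sign, as in the Example at the end of this section.

```matlab
% Write two small CSV files (hypothetical names and contents).
fid = fopen('first.csv','w');
fprintf(fid,',Y,P\n2010Q1,1,10\n2010Q2,2,20\n');
fclose(fid);
fid = fopen('second.csv','w');
fprintf(fid,',R\n2010Q1,5\n2010Q2,6\n');
fclose(fid);

% Read only Y from the first file; P is not on the list and is discarded.
d = dbload('first.csv','select',{'Y'});

% Add the entries from the second file to the existing database.
d = dbload(d,'second.csv');
```

After these calls, d would be expected to contain the entries Y and R but not P.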
Use the 'freq=' option whenever there is ambiguity in interpreting the date strings and IRIS is not able to determine the frequency correctly (see Example).
The minimal structure of a CSV database file has a leading row with variable names, a leading column with dates in the basic IRIS format, and individual columns with numeric data:
+---------+---------+---------+--
|         | Y       | P       |
+---------+---------+---------+--
| 2010Q1  | 1       | 10      |
+---------+---------+---------+--
| 2010Q2  | 2       | 20      |
+---------+---------+---------+--
|         |         |         |
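As a minimal sketch of this layout, the snippet below writes the table to a file and loads it with default options. It assumes the IRIS Toolbox is on the MATLAB path; the file name minimal.csv is hypothetical.

```matlab
% Write the minimal layout: a row of variable names with an empty first
% cell, then one row per date. The file name is hypothetical.
fid = fopen('minimal.csv','w');
fprintf(fid,',Y,P\n');
fprintf(fid,'2010Q1,1,10\n');
fprintf(fid,'2010Q2,2,20\n');
fclose(fid);

% Load the file with default options (requires IRIS on the path).
d = dbload('minimal.csv');

% List the database entries created from the CSV columns.
disp(fieldnames(d));
```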
You can add a comment row (must be placed before the data part, and start with a label 'Comment' in the first cell) that will also be read in and assigned as comments to the individual tseries objects created in the output database.
+---------+---------+---------+--
|         | Y       | P       |
+---------+---------+---------+--
| Comment | Output  | Prices  |
+---------+---------+---------+--
| 2010Q1  | 1       | 10      |
+---------+---------+---------+--
| 2010Q2  | 2       | 20      |
+---------+---------+---------+--
|         |         |         |
You can use a different label in the first cell to denote a comment row; in that case you need to set the option 'commentRow=' accordingly.
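A minimal sketch of a non-default comment label follows. It assumes the IRIS Toolbox is on the MATLAB path; the file name labelled.csv and the label 'Description' are hypothetical choices made for illustration.

```matlab
% Write a CSV file whose comment row starts with 'Description'
% instead of the default 'Comment' (hypothetical file name and label).
fid = fopen('labelled.csv','w');
fprintf(fid,',Y,P\n');
fprintf(fid,'Description,Output,Prices\n');
fprintf(fid,'2010Q1,1,10\n');
fprintf(fid,'2010Q2,2,20\n');
fclose(fid);

% Tell dbload which label marks the comment row.
d = dbload('labelled.csv','commentRow','Description');
```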
All CSV rows whose names start with the character specified in the option 'userDataField=' (a dot by default) will be added to the output tseries objects as fields of their userdata.
+---------+---------+---------+--
|         | Y       | P       |
+---------+---------+---------+--
| Comment | Output  | Prices  |
+---------+---------+---------+--
| .Source | Stat    | IMFIFS  |
+---------+---------+---------+--
| .Update | 17Feb11 | 01Feb11 |
+---------+---------+---------+--
| .Units  | Bil USD | 2010=1  |
+---------+---------+---------+--
| 2010Q1  | 1       | 10      |
+---------+---------+---------+--
| 2010Q2  | 2       | 20      |
+---------+---------+---------+--
|         |         |         |
A typical example of using the 'freq=' option is a quarterly database with dates represented by the corresponding months, such as the sequence 2000-01-01, 2000-04-01, 2000-07-01, 2000-10-01, etc. In this case, you can use the following options:
d = dbload('filename.csv','dateFormat','YYYY-MM-01','freq',4);