DataStore, Layers and legacy files

Hi all,

I am the guy working on the "Fielded Text" standard mentioned in 
December.  I just did an update to the standard 
(http://www.fieldedtext.org/Standard) and while doing so, gave some 
thought regarding the aims of Fielded Text compared to the aims of "CSV 
on the Web".

Fielded Text focuses on standardising the encoding and decoding of 
tabular data into text for transport purposes.  It aims to support as 
wide a range of text formats as possible (delimited and fixed width) and 
to be as compatible as possible with existing text files.  
The Meta in Fielded Text is limited to what is needed for encoding and 
decoding.  It recognises attributes or behaviour that are already 
implicit in text files holding tabular data (e.g. headings, comments, 
null values) and adds only a few considered essential to encoding (e.g. 
typed fields, field names and Ids).  It also adds some attributes to 
support round tripping (e.g. write formats).

A good analogy to Fielded Text is string encoding/decoding.  If you want 
to move text from one system to a different system, you encode it in 
one of the well known formats (say UTF-8 or an MBCS).  The person at 
the other end will then be able to decode it using standardised methods 
and import the text into their system.
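
As a rough illustration of the analogy only (the sample string is made 
up), a string round trip through an agreed encoding looks like this in 
Python:

```python
# Sender side: encode the text in an agreed, well known format (UTF-8 here).
original = "Fielded Text: naïve café"
wire_bytes = original.encode("utf-8")

# Receiver side: decode using the same agreed format.
received = wire_bytes.decode("utf-8")

# The receiver recovers exactly what the sender had.
assert received == original
```

Fielded Text aims to play the same role for tabular data: both ends 
agree on the Meta, so the table decoded by the receiver matches the one 
encoded by the sender.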

As I see it, "CSV on the Web" is more focused on publishing (as opposed 
to transport).  The Meta data for "CSV on the Web" assigns a far greater 
number of attributes to the tabular data. The aim with this seems to be 
to provide more information about the data within the files, describe 
linkages between files, assist with transformations and control access 
to them.  In my view it seems like it's aiming to be a Text Database 
focused on publishing, using CSV as the data store.

After I considered the above, I realised that Fielded Text covers a 
subset of "CSV on the Web": specifically, access to the tabular data in 
the data store.

In .NET Microsoft defined a couple of interfaces which could be 
construed as providing layers to data store access.  These are 
IDataRecord and IDataReader.

These are documented at:
- https://msdn.microsoft.com/en-us/library/system.data.idatarecord%28v=vs.110%29.aspx
- https://msdn.microsoft.com/en-us/library/system.data.idatareader%28v=vs.110%29.aspx

It was surprisingly easy to implement these interfaces in my 
implementation of FieldedText:
http://sourceforge.net/p/tfieldedtext/code/ci/default/tree/delphi/2/Xilytix.FieldedText.DotNetDataReader.pas
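
To give a feel for what such a layer might look like outside .NET, here 
is a minimal Python sketch.  It is not the .NET API and not the Delphi 
implementation linked above: the DataRecord/DataReader protocols are 
only loosely modelled on IDataRecord/IDataReader, and every name in it 
(including CsvDataReader and dump) is hypothetical.

```python
import csv
import io
from typing import Protocol


class DataRecord(Protocol):
    """Field access for the current row (loosely after IDataRecord)."""

    @property
    def field_count(self) -> int: ...
    def get_name(self, i: int) -> str: ...
    def get_value(self, i: int) -> object: ...


class DataReader(DataRecord, Protocol):
    """Forward-only row iteration (loosely after IDataReader)."""

    def read(self) -> bool: ...


class CsvDataReader:
    """One possible data store behind the layer: CSV text with a heading line."""

    def __init__(self, text: str) -> None:
        self._rows = csv.reader(io.StringIO(text))
        self._names = next(self._rows)  # first line holds the field names
        self._current = []

    @property
    def field_count(self) -> int:
        return len(self._names)

    def get_name(self, i: int) -> str:
        return self._names[i]

    def get_value(self, i: int) -> object:
        return self._current[i]

    def read(self) -> bool:
        # Advance to the next row; return False when the data is exhausted.
        row = next(self._rows, None)
        if row is None:
            return False
        self._current = row
        return True


def dump(reader: DataReader) -> list:
    """Consumer code that only depends on the layer, not on the store."""
    rows = []
    while reader.read():
        rows.append({reader.get_name(i): reader.get_value(i)
                     for i in range(reader.field_count)})
    return rows


rows = dump(CsvDataReader("id,name\n1,Ann\n2,Bob\n"))
```

The point of the sketch is that dump() never learns how the rows are 
stored, so a fixed width or Fielded Text backed reader could be swapped 
in without changing it.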

Having said all of the above, here is a suggestion.

If "CSV on the Web" defined layers similar to the above for accessing 
the Data Store, other standards such as Fielded Text could be used to 
specify the implementation of the Data Store.

For example, "CSV on the Web" Meta would define a field's name, data 
type and headings and then Fielded Text's Meta would define how that 
field is actually stored (Delimited or Fixed Width, delimiter character, 
format picture strings).
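
A minimal Python sketch of that split follows.  The dictionary shapes 
are invented for illustration and are not the syntax of either 
standard's Meta; the splitting is also deliberately naive (no quoting 
or escaping).

```python
# Logical layer: the role suggested for "CSV on the Web" Meta.
# Hypothetical shape: field name, data type and heading only.
logical_fields = [
    {"name": "id", "type": int, "heading": "Id"},
    {"name": "city", "type": str, "heading": "City"},
]

# Storage layer: the role suggested for Fielded Text Meta.
# Hypothetical shapes: how the same fields are laid out as text.
delimited = {"kind": "delimited", "delimiter": ";"}
fixed_width = {"kind": "fixed", "widths": [4, 10]}


def split_line(line, storage):
    """Cut one line of text into raw field strings using the storage Meta."""
    if storage["kind"] == "delimited":
        return line.split(storage["delimiter"])
    parts, pos = [], 0
    for width in storage["widths"]:
        parts.append(line[pos:pos + width].strip())
        pos += width
    return parts


def decode_line(line, storage):
    """Apply the logical Meta (names and types) to the raw field strings."""
    raw = split_line(line, storage)
    return {f["name"]: f["type"](text) for f, text in zip(logical_fields, raw)}


from_delimited = decode_line("7;Perth", delimited)
from_fixed = decode_line("   7Perth     ", fixed_width)

# Two different storage layouts, one logical record.
assert from_delimited == from_fixed
```

Only split_line() depends on the storage description; everything the 
consumer sees comes from the logical description.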

The upside would be access to different types of data stores, 
potentially opening up a large number of 'legacy' text files.  
The downside is that the standard would be less constrained, so 
implementations would be harder to write and might not provide 
complete coverage.

Anyway, I am just floating it as an idea.  Hopefully you consider it 
relevant.

Regards
Paul

Received on Wednesday, 25 February 2015 08:40:34 UTC