Tools
Contents
- 1 FileMaker
- 2 CSVKit
- 3 CSVLint
- 4 CSV Schema Language and CSV Validator
- 5 Datapipes
- 6 Python's CSV module
- 7 PHP's built-in CSV parser
- 8 node-csv
- 9 Tablinker
- 10 PLSheet
- 11 Harmonize
- 12 Microsoft Excel
- 13 iWork's Numbers
- 14 Apache Open Office's Calc
- 15 LibreOffice's Calc
- 16 q
- 17 sheetsee.js
- 18 dat
- 19 Timeline
- 20 DataUp
- 21 COOPY
- 22 Tableau Public
- 23 Google Sheets
- 24 QuickOffice
- 25 Google Fusion Tables
- 26 Google Public Data Explorer
- 27 TARQL
- 28 MR Data Converter
- 29 Bio Table
- 30 OpenDataPress
- 31 Grinder
- 32 Tablib
- 33 TextQL
- 34 Unsorted
FileMaker
Relational database system from an Apple subsidiary. It can import and export CSV (both comma- and tab-separated). A simplified version of FileMaker, called Bento, also ran on iOS; Bento was discontinued in autumn 2013.
CSVKit
A suite of utilities for converting to and working with CSV, the king of tabular file formats
https://github.com/onyxfish/csvkit
CSVLint
Online CSV validation service from the Open Data Institute; possibly still a work in progress.
http://csvlint.io/ - https://github.com/theodi/csvlint
CSV Schema Language and CSV Validator
The National Archives (UK)
https://github.com/digital-preservation/csv-validator/
XSLT for converting CSV to XML: https://github.com/digital-preservation/csv-tools/
Datapipes
Data Pipes is a service to provide streaming, "pipe-like" data transformations on the web – things like deleting rows or columns, find and replace, head, grep etc.
Python's CSV module
http://docs.python.org/release/2.7/library/csv.html
It has been part of the official Python distribution since version 2.3. It provides a thin wrapper around CSV files for reading or writing line by line, optionally using the header row's elements as keys. It can be customized to handle different separator characters (comma, tab, etc.). It follows Python's iteration model around the standard file object, so it can be used with large files.
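A minimal sketch of the features described above, assuming Python 3 (the linked documentation is for 2.7, where text handling differs). The sample data and column names are hypothetical; a real script would pass an open file object instead of the in-memory buffer.

```python
import csv
import io

# Hypothetical tab-separated sample; in practice use
# open("data.tsv", newline="") instead of an in-memory buffer.
sample = io.StringIO("name\tcity\nAda\tLondon\nLinus\tHelsinki\n")

# DictReader takes its keys from the header row; `delimiter` selects
# the separator character.
reader = csv.DictReader(sample, delimiter="\t")

# The reader is an iterator, so rows are produced one at a time rather
# than being loaded all at once.
rows = []
for row in reader:
    rows.append(row)
    print(row["name"], row["city"])
```

The rows are collected in a list here only to make the result easy to inspect; for a large file, each row would be processed inside the loop and then discarded.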
PHP's built-in CSV parser
http://php.net/fgetcsv
node-csv
http://www.adaltas.com/projects/node-csv/
Streaming CSV parser for node.js
Tablinker
TabLinker is experimental software for converting manually annotated Microsoft Excel workbooks to the RDF Data Cube vocabulary.
https://github.com/Data2Semantics/TabLinker
PLSheet
PLSheet is a SWI-Prolog library for analyzing ODF spreadsheets.
https://github.com/Data2Semantics/PLSheet
Harmonize
Harmonize is a prototype tool for normalizing and aligning RDF Data Cube datasets and exporting them as CSV.
http://lod.cedar-project.nl:8082/harmonize
https://github.com/CEDAR-project/Harmonize
Microsoft Excel
Seemingly ubiquitous tool for working with tabular data (aka "spreadsheets").
MS Excel compatibility test results using files provided in the CSV+ Syntax document.
iWork's Numbers
Apple's office suite, becoming fairly widespread on Mac and iOS platforms; Numbers is the spreadsheet tool within the suite and can import and export CSV.
Numbers compatibility test results using files provided in the CSV+ Syntax document (run on Mac, not on iOS).
Apache Open Office's Calc
Open-source office suite; Calc is the spreadsheet tool within the suite and can import and export CSV. One of the two 'standard' office suites on Linux distributions (the other being LibreOffice).
Calc compatibility test results using files provided in the CSV+ Syntax document.
LibreOffice's Calc
Open-source office suite; Calc is the spreadsheet tool within the suite and can import and export CSV. One of the two 'standard' office suites on Linux distributions (the other being Open Office).
Calc compatibility test results using files provided in the CSV+ Syntax document.
q
Command-line SQL query interface for tabular text data.
sheetsee.js
"a client-side library for connecting Google Spreadsheets to a website and visualizing the information in tables, maps and charts"
http://jlord.github.io/sheetsee.js/
dat
"real-time replication and versioning for large tabular data sets"
Timeline
"TimelineJS is an open-source tool that enables anyone to build visually rich, interactive timelines. Beginners can create a timeline using nothing more than a Google spreadsheet."
http://timeline.knightlab.com/
DataUp
"An open source tool helping researchers document, manage, and archive their tabular data, DataUp operates within the scientist's workflow and integrates with Microsoft® Excel."
COOPY
Revision control for tables
Tableau Public
Tableau Public is a free interactive data visualization product focused on business intelligence. It is frequently used by journalists to visualize, analyze and tell stories about data. See, for example, the work of La Nacion in Buenos Aires, Argentina.
Google Sheets
QuickOffice
Google Fusion Tables
https://www.google.com/fusiontables/
Google Public Data Explorer
http://www.google.com/publicdata/
About the Public Data Explorer
TARQL
https://github.com/cygri/tarql
Tarql is a command-line tool for converting CSV files to RDF using SPARQL 1.1 syntax. It's written in Java and based on Apache ARQ.
MR Data Converter
http://shancarter.github.io/mr-data-converter/
Client-side conversion of CSV to JSON (3 variations), XML (3 variations) and other formats.
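To illustrate what "variations" of a JSON rendering can mean, here is a minimal Python sketch that produces two plausible shapes from the same CSV: a list of row objects and a dictionary of column arrays. These shapes are assumptions chosen for illustration; this page does not specify which three variations the tool offers, and the sample data is hypothetical.

```python
import csv
import io
import json

# Hypothetical sample input.
sample = io.StringIO("id,name\n1,Ada\n2,Linus\n")
rows = list(csv.reader(sample))
header, body = rows[0], rows[1:]

# Shape 1: one object per row, keyed by column name.
as_objects = [dict(zip(header, row)) for row in body]

# Shape 2: one array per column, keyed by column name.
as_columns = {name: [row[i] for row in body] for i, name in enumerate(header)}

objects_json = json.dumps(as_objects)
columns_json = json.dumps(as_columns)
print(objects_json)
print(columns_json)
```

All values remain strings in both shapes, because CSV itself carries no type information; any numeric conversion would be an extra, explicit step.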
Bio Table
http://rubygems.org/gems/bio-table
"Functions and tools for transforming and changing tab delimited and comma separated table files - useful for Excel sheets and SQL/RDF output"
OpenDataPress
"Easily create Open Data from Google Spreadsheets"
Grinder
(Sorry about the poorly chosen name!) Converts tabular data (XLSX, XLS, TSV, CSV, PSV) into an XML data structure, then passes it to XSLT for conversion (generally into RDF+XML).
https://github.com/cgutteridge/Grinder
Also worth looking at the command line options: https://github.com/cgutteridge/Grinder/blob/master/bin/grinder#L91
Tablib
Tablib is an MIT-licensed, format-agnostic tabular dataset library written in Python. It allows you to import, export, and manipulate tabular data sets. Advanced features include segregation, dynamic columns, tags and filtering, and seamless format import and export.
http://tablib.readthedocs.org/en/latest/
TextQL
TextQL is an MIT Licensed command-line application that allows one to execute SQL commands against tabular (CSV, TSV) data.
https://github.com/dinedal/textql
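The general idea of running SQL against tabular text can be sketched with Python's standard library alone: load the CSV rows into an in-memory SQLite table, then query it. This is an illustration of the concept under stated assumptions, not a description of how TextQL (or q) is implemented; the table name `data`, the sample rows, and the column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample input.
sample = io.StringIO("city,population\nOslo,700000\nBergen,290000\nTromso,78000\n")
reader = csv.reader(sample)
header = next(reader)

conn = sqlite3.connect(":memory:")

# Column names come from the header row; they are double-quoted so that
# unusual names remain valid SQL identifiers.
columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
conn.execute("CREATE TABLE data ({})".format(columns))

# Insert the remaining rows using parameter placeholders.
placeholders = ", ".join("?" for _ in header)
conn.executemany("INSERT INTO data VALUES ({})".format(placeholders), reader)

# CSV values arrive as text, so cast before comparing numerically.
result = conn.execute(
    "SELECT city FROM data "
    "WHERE CAST(population AS INTEGER) > 100000 ORDER BY city"
).fetchall()
print(result)
```

The explicit cast matters: without it, the comparison would be made against text values rather than numbers.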
Unsorted
Copied in from WebSchemas wiki.
R, Mathematica, Matlab and Octave
- The R language has a 'data frame' construction, http://www.r-tutor.com/r-introduction/data-frame
- Octave has http://www.gnu.org/software/octave/doc/interpreter/Cell-Arrays.html and also http://octave.sourceforge.net/dataframe/overview.html which is close to the R approach.
- Matlab has recently added a similar construct, see http://www.mathworks.se/help/matlab/tables.html (via http://www.mathworks.se/help/matlab/release-notes.html http://www.mathworks.se/products/matlab/whatsnew.html )
- Matlab's table construct may be closer to a Matlab/Octave structure (named fields) than to a cell array, though the two are related; it is also new in Matlab 2013b. (via b_jonas, jwe in #octave IRC.freenode.net)
- pandas (http://pandas.pydata.org/) provides a data frame in Python
- R/JSON/D3 discussion
- See http://www.wolfram.com/mathematica/new-in-9/built-in-integration-with-r/create-and-display-data-frames.html http://mathematica.stackexchange.com/questions/19136/creating-a-r-dataframe-like-construct-in-mathematica