Tools

From CSV on the Web Working Group Wiki

FileMaker

Relational database system from an Apple subsidiary. Can import and export CSV (both comma- and tab-separated). There is also a simplified version of FileMaker (called Bento) that also runs on iOS; development of Bento stopped in autumn 2014.

CSVKit

A suite of utilities for converting to and working with CSV, the king of tabular file formats

https://github.com/onyxfish/csvkit

CSVLint

WIP?

http://csvlint.io/ - https://github.com/theodi/csvlint

CSV Schema Language and CSV Validator

The National Archives (UK)

https://github.com/digital-preservation/csv-validator/

XSLT for converting CSV to XML: https://github.com/digital-preservation/csv-tools/

Datapipes

Data Pipes is a service to provide streaming, "pipe-like" data transformations on the web – things like deleting rows or columns, find and replace, head, grep etc.

http://datapipes.okfnlabs.org
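
The "pipe-like" idea can be sketched locally with chained Python generators. This is only an illustration of the concept, assuming Python 3; it is not Data Pipes' own code or API, and the function names (`parse`, `grep`, `cut`, `replace`, `head`) and the sample data are hypothetical.

```python
import csv
import io
import itertools

def parse(lines):
    """Turn an iterable of text lines into an iterator of rows (lists of cells)."""
    return csv.reader(lines)

def grep(rows, text):
    """Keep only rows in which at least one cell contains text."""
    return (row for row in rows if any(text in cell for cell in row))

def cut(rows, drop):
    """Delete the columns whose zero-based indexes are in drop."""
    return ([cell for i, cell in enumerate(row) if i not in drop] for row in rows)

def replace(rows, old, new):
    """Find and replace inside every cell."""
    return ([cell.replace(old, new) for cell in row] for row in rows)

def head(rows, n):
    """Pass through the first n rows only."""
    return itertools.islice(rows, n)

# Hypothetical sample data; the header is treated like any other row here.
SAMPLE = "id,city,note\n1,Paris,ok\n2,Lima,bad\n3,Paris,ok\n"

# Each stage consumes the previous one lazily, so rows stream through one at a time.
stages = parse(io.StringIO(SAMPLE))
stages = grep(stages, "Paris")
stages = cut(stages, {0})
stages = replace(stages, "ok", "good")
stages = head(stages, 1)

result = list(stages)
print(result)
```

Because every stage is a generator, nothing is read until the final `list()` call pulls rows through the chain, which is what makes this style workable for large inputs.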

Python's CSV module

http://docs.python.org/release/2.7/library/csv.html

It has been part of the official Python distribution since version 2.3. It provides a simple wrapper around CSV files for reading or writing line by line, optionally using the header row's elements as keys. It can be customized to handle different separator characters (comma, tab, etc.). It uses Python's iteration model around the standard file object, so it can be used with potentially large files.
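
A minimal sketch of the features just described (header row as keys, a custom separator, and lazy iteration), assuming Python 3. The sample data and the `read_rows` helper are hypothetical; a real program would pass a file object opened with `newline=""` instead of `io.StringIO`.

```python
import csv
import io

# Hypothetical tab-separated sample standing in for a file on disk.
SAMPLE = "name\tcount\nalpha\t1\nbeta\t2\n"

def read_rows(stream, delimiter=","):
    """Yield one dict per data row, keyed by the header row's elements."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    for row in reader:  # rows are produced lazily, one line at a time
        yield row

rows = list(read_rows(io.StringIO(SAMPLE), delimiter="\t"))
print(rows[0]["name"], rows[1]["count"])
```

All values come back as strings; converting `count` to an integer is left to the caller.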

PHP's built-in CSV parser

php.net/fgetcsv

node-csv

http://www.adaltas.com/projects/node-csv/

Streaming CSV parser for node.js

Tablinker

Tablinker is experimental software for converting manually annotated Microsoft Excel workbooks to the RDF Data Cube vocabulary.

https://github.com/Data2Semantics/TabLinker

PLSheet

PLSheet is a SWI-Prolog library for analyzing ODF spreadsheets.

https://github.com/Data2Semantics/PLSheet

Harmonize

Harmonize is a prototype tool for normalizing and aligning RDF Data Cube datasets, and for getting CSVs out of them.

http://lod.cedar-project.nl:8082/harmonize

https://github.com/CEDAR-project/Harmonize

Microsoft Excel

Seemingly ubiquitous tool for working with tabular data (aka "spreadsheets")

MS Excel compatibility test results using files provided in the CSV+ Syntax document.

iWork's Numbers

Apple's office suite, becoming fairly widespread on Mac and iOS platforms; "Numbers" is the spreadsheet tool within the suite, which can import/export CSV.

Numbers compatibility test results using files provided in the CSV+ Syntax document (run on Mac, not on iOS).

Apache OpenOffice's Calc

Open source office suite; Calc is the spreadsheet tool within the suite that can import/export CSV. One of the two 'standard' office suites on Linux distributions (the other being LibreOffice).

Calc compatibility test results using files provided in the CSV+ Syntax document

LibreOffice's Calc

Open source office suite; Calc is the spreadsheet tool within the suite that can import/export CSV. One of the two 'standard' office suites on Linux distributions (the other being Apache OpenOffice).

Calc compatibility test results using files provided in the CSV+ Syntax document

q

command-line SQL query interface for tabular text data

https://github.com/harelba/q

sheetsee.js

"a client-side library for connecting Google Spreadsheets to a website and visualizing the information in tables, maps and charts"

http://jlord.github.io/sheetsee.js/

dat

"real-time replication and versioning for large tabular data sets"

http://dat-data.com/

Timeline

"TimelineJS is an open-source tool that enables anyone to build visually rich, interactive timelines. Beginners can create a timeline using nothing more than a Google spreadsheet."

http://timeline.knightlab.com/

DataUp

"An open source tool helping researchers document, manage, and archive their tabular data, DataUp operates within the scientist's workflow and integrates with Microsoft® Excel."

http://dataup.cdlib.org/

COOPY

Revision control for tables

http://share.find.coop/

Tableau Public

Tableau Public is a free interactive data visualization product focused on business intelligence. It is frequently used by journalists to visualize, analyze and tell stories about data. See, for example, the work of La Nacion in Buenos Aires, Argentina.

Google Sheets

Overview of Google Sheets

QuickOffice

What is QuickOffice?

Google Fusion Tables

https://www.google.com/fusiontables/

About Fusion Tables

Google Public Data Explorer

http://www.google.com/publicdata/

About the Public Data Explorer

TARQL

https://github.com/cygri/tarql

Tarql is a command-line tool for converting CSV files to RDF using SPARQL 1.1 syntax. It's written in Java and based on Apache ARQ.

MR Data Converter

http://shancarter.github.io/mr-data-converter/

Client-side conversion of CSV to JSON (3 variations), XML (3 variations) and other formats.

Bio Table

http://rubygems.org/gems/bio-table

"Functions and tools for transforming and changing tab delimited and comma separated table files - useful for Excel sheets and SQL/RDF output"

OpenDataPress

http://opendatapress.org/

"Easily create Open Data from Google Spreadsheets"

Grinder

(sorry about the poorly chosen name!) Converts tabular data (XLSX, XLS, TSV, CSV, PSV) into an XML data structure, then passes it to XSLT for conversion (generally into RDF/XML)

https://github.com/cgutteridge/Grinder

Also worth looking at the command line options: https://github.com/cgutteridge/Grinder/blob/master/bin/grinder#L91

Tablib

Tablib is an MIT-licensed, format-agnostic tabular dataset library, written in Python. It allows you to import, export, and manipulate tabular data sets. Advanced features include segregation, dynamic columns, tags & filtering, and seamless format import & export.

http://tablib.readthedocs.org/en/latest/

TextQL

TextQL is an MIT-licensed command-line application that allows one to execute SQL commands against tabular (CSV, TSV) data.

https://github.com/dinedal/textql
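
The general idea of running SQL against CSV data can be sketched with Python's standard `csv` and `sqlite3` modules. This is an illustration only, assuming Python 3; it does not show TextQL's own command-line options or internals, and the `load_csv` helper, the table name, and the sample data are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE = "city,population\nLima,10\nParis,2\nOslo,1\n"

def quote(identifier):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"{}"'.format(identifier.replace('"', '""'))

def load_csv(conn, table, stream):
    """Create a table named after the CSV header and insert every data row."""
    reader = csv.reader(stream)
    header = next(reader)
    columns = ", ".join(quote(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute("CREATE TABLE {} ({})".format(quote(table), columns))
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quote(table), placeholders), reader
    )

conn = sqlite3.connect(":memory:")
load_csv(conn, "cities", io.StringIO(SAMPLE))

# Cells are loaded as text, so numeric comparisons need an explicit cast.
result = conn.execute(
    "SELECT city FROM cities WHERE CAST(population AS INTEGER) > 1 ORDER BY city"
).fetchall()
print(result)
```

Loading everything as text keeps the sketch short; a fuller version would infer column types before creating the table.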

Unsorted

Copied in from WebSchemas wiki.

R, Mathematica, Matlab and Octave