Difference between revisions of "TTB"

From ProgClub

Revision as of 18:24, 11 May 2017

The text tables format is a data format that is potentially less bandwidth-intensive than JSON (because column names are stated once per table rather than repeated in every record), but that can readily be converted to JSON.

#
# The data format supports hash comments before tables
#

member: (content here is ignored until after new line)
uid:int:key \t wiki_name \t join_date
1000        \t John      \t 2011-07-25
1001        \t Tasaio    \t 2011-07-26

group:
gid:int:key \t name \t description
27          \t sudo \t Administrators
500         \t user \t Users

member_group:
uid:int \t gid:int
1000    \t 27
1000    \t 500
1001    \t 27
1001    \t 500

The JSON would be:

var data = {
  "member": [
    { "uid": 1000, "wiki_name": "John", "join_date": "2011-07-25" },
    { "uid": 1001, "wiki_name": "Tasaio", "join_date": "2011-07-26" }
  ],
  "group": [
    { "gid": 27, "name": "sudo", "description": "Administrators" },
    { "gid": 500, "name": "user", "description": "Users" }
  ],
  "member_group": [
    { "uid": 1000, "gid": 27 },
    { "uid": 1000, "gid": 500 },
    { "uid": 1001, "gid": 27 },
    { "uid": 1001, "gid": 500 }
  ]
};

Data structures can be built on top of the base data and reference it. E.g. the record data in the "member" array could be referenced by a "member_map" map like this:

var member_map = {
  "1000": data.member[0], // reference to the John record, not a copy
  "1001": data.member[1]  // reference to the Tasaio record, not a copy
};

Such maps can be constructed automatically when we know which columns are keys. (We could also potentially auto-detect keys when values in a column are all unique, and perhaps also suitably concise.)
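
As a rough sketch of how such maps might be built, the following assumes records are plain objects as in the JSON above. The helper names buildKeyMap and isUniqueColumn are hypothetical and not part of the format; the uniqueness check only illustrates the auto-detection idea and does not attempt the "suitably concise" test.

```javascript
// Hypothetical helper: build a map from key value to record.
// Records are stored by reference, so the map adds no copies of the data.
function buildKeyMap(records, keyColumn) {
  const map = Object.create(null); // no inherited keys such as "toString"
  for (const record of records) {
    map[String(record[keyColumn])] = record;
  }
  return map;
}

// Hypothetical helper: a column is a key candidate when all its values differ.
function isUniqueColumn(records, column) {
  const seen = new Set();
  for (const record of records) {
    const value = String(record[column]);
    if (seen.has(value)) return false;
    seen.add(value);
  }
  return true;
}

const members = [
  { uid: 1000, wiki_name: "John", join_date: "2011-07-25" },
  { uid: 1001, wiki_name: "Tasaio", join_date: "2011-07-26" }
];
const memberMap = buildKeyMap(members, "uid");
```

Because memberMap["1000"] and members[0] are the same object, a change made through one is visible through the other.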

Descriptor cells

A descriptor cell is a colon-delimited sequence of values, where the meaning of each value depends on its position. Values that are not specified are ignored.

Table descriptor cell

The first descriptor cell is the table descriptor cell. Its first value is the table name. Other values are not used presently.

Column descriptor cell

The first line in a table (the table header) contains the column descriptor cells (which are tab-delimited).

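The first value in a column descriptor cell is the column name. The second (optional) value is the column type (the default type is 'string'). Values after the type, such as 'key' in uid:int:key in the example above, are column attributes. Other values are not used presently.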
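
A minimal sketch of reading a table header under these rules follows. The function names parseColumnDescriptor and parseTableHeader are hypothetical, and treating every value after the type as an attribute is an assumption drawn from the uid:int:key example rather than a stated rule.

```javascript
// Hypothetical helper: split a column descriptor cell such as "uid:int:key".
function parseColumnDescriptor(cell) {
  const values = cell.trim().split(":").map((v) => v.trim());
  return {
    name: values[0],
    // An omitted or empty type falls back to the default, 'string'.
    type: values[1] || "string",
    // Assumption: any values after the type are attributes (e.g. "key").
    attributes: values.slice(2).filter((v) => v !== "")
  };
}

// The table header is tab-delimited, one descriptor cell per column.
function parseTableHeader(line) {
  return line.split("\t").map((cell) => parseColumnDescriptor(cell));
}

const columns = parseTableHeader("uid:int:key \t wiki_name \t join_date");
```

Here columns[0] describes an int column named uid with the key attribute, while the other two columns fall back to the string type.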

Column types

Supported data-types are:

  • bool
  • int
  • float
  • string
  • text (URL encoded, and auto-decoded)
  • json (JSON encoded, and auto-decoded)
    • note: JSON inputs must not contain raw control characters (e.g. tab or new-line). If that can happen, JSON inputs will need to be URL-encoded JSON. This requires investigation.
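
The conversion of a raw field according to its column type could be sketched as below. The function name convertField is hypothetical. The accepted bool spellings and the fallback for unknown types are assumptions, since the format does not define them, and "URL encoded" is assumed to mean percent-encoding as handled by decodeURIComponent.

```javascript
// Hypothetical helper: convert one raw field (already trimmed) to a value.
function convertField(raw, type) {
  switch (type) {
    case "bool":
      // Assumption: the format does not say how bools are written;
      // "true" and "1" are treated as true here, anything else as false.
      return raw === "true" || raw === "1";
    case "int":
      return parseInt(raw, 10);
    case "float":
      return parseFloat(raw);
    case "text":
      // URL-encoded text is decoded automatically (percent-encoding assumed).
      return decodeURIComponent(raw);
    case "json":
      // JSON-encoded values are decoded automatically.
      return JSON.parse(raw);
    case "string":
    default:
      // Assumption: unknown types fall back to plain strings.
      return raw;
  }
}

// A tab inside a text field travels as %09, so it cannot break the row.
const decodedText = convertField("first%09second", "text");
const decodedJson = convertField('{"tags":["a","b"]}', "json");
```

Percent-encoding the text type keeps tabs and new-lines out of the raw field, which is the same concern the note raises for json values.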

Other types to consider (not used presently):

  • date
  • time
  • datetime
  • base64 (encoded)
  • md5, sha256, etc.
  • regex (regular expression)

Column attributes

Supported column attributes are:

  • key (indicates that the column's values, of any type, are unique and can be used as identifiers)

Other attributes to consider (not used presently):

  • index (create an array of records indexed by a given field)

Fields (data cells)

The bulk of a table is in the fields of its data rows. Rows continue after the table header until a blank line. Fields within a row are tab-delimited, and leading and trailing white-space in each field is ignored.
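
Putting the rules together, a parser might look roughly like the sketch below. The function name parseTextTables is hypothetical, and for brevity this sketch converts only the int and float types and ignores column attributes. It also assumes hash comments appear only between tables, as in the example at the top of the page.

```javascript
// Hypothetical parser sketch: text tables in, { tableName: [records] } out.
function parseTextTables(text) {
  const data = {};
  let rows = null;     // records of the table being read, or null between tables
  let columns = null;  // column descriptors, or null until the header is read

  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") {
      rows = null;     // a blank line ends the current table
      continue;
    }
    if (rows === null) {
      if (line.trim().startsWith("#")) continue; // hash comment before a table
      // Table descriptor cell: the first colon-delimited value is the name.
      const name = line.split(":")[0].trim();
      rows = data[name] = [];
      columns = null;
      continue;
    }
    // Fields are tab-delimited; surrounding white-space is ignored.
    const cells = line.split("\t").map((cell) => cell.trim());
    if (columns === null) {
      // Table header: "name:type:..." per cell, type defaults to 'string'.
      columns = cells.map((cell) => {
        const values = cell.split(":");
        return { name: values[0], type: values[1] || "string" };
      });
      continue;
    }
    const record = {};
    columns.forEach((column, i) => {
      const raw = cells[i] === undefined ? "" : cells[i];
      // Only int and float are converted in this sketch; the rest stay strings.
      record[column.name] =
        column.type === "int" ? parseInt(raw, 10) :
        column.type === "float" ? parseFloat(raw) : raw;
    });
    rows.push(record);
  }
  return data;
}

const sample = [
  "# comment before the first table",
  "",
  "group: (ignored)",
  "gid:int:key \t name \t description",
  "27          \t sudo \t Administrators",
  "500         \t user \t Users",
  "",
  "member_group:",
  "uid:int \t gid:int",
  "1000    \t 27"
].join("\n");
const parsed = parseTextTables(sample);
```

With this input, parsed.group holds two records and parsed.member_group holds one, in the same shape as the JSON shown earlier.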