Data types allowed in the integrations with Tableau, Power BI and Crystal Reports: powerviews

josepato · August 25, 2021, 8:24pm

Powerviews requires you to specify the format (schema) that will be used to:

Interpret the JSON received as input (the input_schema). The input comes from MongoDB or CouchDB (CouchDB support was still in development as of 2021-08-25).
Define which columns are shown, and in what format, in the final view the user will have access to (the output_schema).

The input (input_schema) and output (output_schema) formats differ in syntax, because when a value is a nested JSON key (subkey) the output_schema must specify which value to extract.

input_schema

The input_schema is an array of JSON objects ([obj1, obj2, ...]) in which order matters.
Each element objN has the format {key: input_type}, where key defines which key will be extracted from the input JSON. Both key and input_type must match the following regex:

^[a-zA-Z][a-zA-Z0-9_]+$
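
A quick way to check a candidate key or type against this pattern is sketched below in Python. The helper name is_valid_identifier is hypothetical and not part of powerviews. Under this pattern a single-character key would not match (the + requires at least two characters in total), and multi-word names such as double precision would not match either, so their aliases (float8) may be the practical choice; that reading is an inference from the regex, not something the post states.

```python
import re

# Pattern quoted in the post: a letter followed by one or more
# letters, digits or underscores.
IDENTIFIER_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9_]+$")


def is_valid_identifier(value: str) -> bool:
    """Return True if the whole string matches the pattern (hypothetical helper)."""
    # fullmatch avoids accepting a trailing newline, which `$` alone would allow.
    return IDENTIFIER_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["fecha", "array_int", "a", "double precision", "1abc"]:
        print(candidate, is_valid_identifier(candidate))
```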

In addition, input_type must be one of the strings that appear in the Name or Aliases column of the following list:

Table 1

Name	Aliases	Description
bigint	int8	signed eight-byte integer
bigserial	serial8	autoincrementing eight-byte integer
bit		fixed-length bit string
bit varying	varbit	variable-length bit string
boolean	bool	logical Boolean (true/false)
box		rectangular box on a plane
bytea		binary data (“byte array”)
character	char	fixed-length character string
character varying	varchar	variable-length character string
cidr		IPv4 or IPv6 network address
circle		circle on a plane
date		calendar date (year, month, day)
double precision	float8	double precision floating-point number (8 bytes)
inet		IPv4 or IPv6 host address
integer	int, int4	signed four-byte integer
interval		time span
json		textual JSON data
jsonb		binary JSON data, decomposed
line		infinite line on a plane
lseg		line segment on a plane
macaddr		MAC (Media Access Control) address
macaddr8		MAC (Media Access Control) address (EUI-64 format)
money		currency amount
numeric	decimal	exact numeric of selectable precision
path		geometric path on a plane
pg_lsn		PostgreSQL Log Sequence Number
point		geometric point on a plane
polygon		closed geometric path on a plane
real	float4	single precision floating-point number (4 bytes)
smallint	int2	signed two-byte integer
smallserial	serial2	autoincrementing two-byte integer
serial	serial4	autoincrementing four-byte integer
text		variable-length character string
time		time of day (no time zone)
time with time zone	timetz	time of day, including time zone
timestamp		date and time (no time zone)
timestamp with time zone	timestamptz	date and time, including time zone
tsquery		text search query
tsvector		text search document
txid_snapshot		user-level transaction ID snapshot
uuid		universally unique identifier
xml		XML data

For more information, see Postgres Data Types. Note that the optional parenthesized parameters that some data types accept, such as varchar(n), are not currently supported.

Additional input_type values: arrays

To support arrays, the input accepts the following special syntax:

array_DATATYPE

Where array is the literal string array and DATATYPE is a value from Table 1.
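
As an illustration of the array_DATATYPE rule, the following Python sketch splits an input_type into an "is array" flag and a base type. The function parse_input_type and the KNOWN_TYPES set are hypothetical; the set holds only a handful of entries from Table 1 to keep the example short, and nothing here reflects how powerviews parses the value internally.

```python
from typing import Tuple

# Small illustrative subset of the Name/Aliases columns of Table 1.
KNOWN_TYPES = {"int", "int4", "integer", "text", "money", "json", "timestamp"}

ARRAY_PREFIX = "array_"


def parse_input_type(input_type: str) -> Tuple[bool, str]:
    """Split an input_type into (is_array, base_type) (hypothetical helper)."""
    if input_type.startswith(ARRAY_PREFIX):
        is_array = True
        base_type = input_type[len(ARRAY_PREFIX):]
    else:
        is_array = False
        base_type = input_type
    # Reject anything that is not in the (abbreviated) type list.
    if base_type not in KNOWN_TYPES:
        raise ValueError(f"unknown data type: {base_type!r}")
    return is_array, base_type


if __name__ == "__main__":
    print(parse_input_type("array_int"))
    print(parse_input_type("money"))
```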

output_schema

The output_schema is an array of JSON objects ([obj1, obj2, ...]) in which order matters.
Each element objN has the format {key: output_type}, where key names the column shown in the final view (in the examples below, op1 and nombre_archivo are output columns, not keys of the input JSON). Both key and output_type must match the following regex:

^[a-zA-Z][a-zA-Z0-9_]+$

In addition, output_type must be one of the strings that appear in the Name or Aliases column of the following list:

Table 2

Name	Aliases	Description
bigint	int8	signed eight-byte integer
bigserial	serial8	autoincrementing eight-byte integer
bit		fixed-length bit string
bit varying	varbit	variable-length bit string
boolean	bool	logical Boolean (true/false)
box		rectangular box on a plane
bytea		binary data (“byte array”)
character	char	fixed-length character string
character varying	varchar	variable-length character string
cidr		IPv4 or IPv6 network address
circle		circle on a plane
date		calendar date (year, month, day)
double precision	float8	double precision floating-point number (8 bytes)
inet		IPv4 or IPv6 host address
integer	int, int4	signed four-byte integer
interval		time span
json		textual JSON data
jsonb		binary JSON data, decomposed
line		infinite line on a plane
lseg		line segment on a plane
macaddr		MAC (Media Access Control) address
macaddr8		MAC (Media Access Control) address (EUI-64 format)
money		currency amount
numeric	decimal	exact numeric of selectable precision
path		geometric path on a plane
pg_lsn		PostgreSQL Log Sequence Number
point		geometric point on a plane
polygon		closed geometric path on a plane
real	float4	single precision floating-point number (4 bytes)
smallint	int2	signed two-byte integer
smallserial	serial2	autoincrementing two-byte integer
serial	serial4	autoincrementing four-byte integer
text		variable-length character string
time		time of day (no time zone)
time with time zone	timetz	time of day, including time zone
timestamp		date and time (no time zone)
timestamp with time zone	timestamptz	date and time, including time zone
tsquery		text search query
tsvector		text search document
txid_snapshot		user-level transaction ID snapshot
uuid		universally unique identifier
xml		XML data

For more information, see Postgres Data Types. Note that the optional parenthesized parameters that some data types accept, such as varchar(n), are not currently supported.

Additional output_type values for output_schema

output_type can also take the following special format, delimited by __ (double underscore):

otype__field

Where otype is one of the strings from Table 1 and field is either an unsigned integer (0, 1, 2, …), used as an array index in the examples below, or a free-form string naming a JSON key (from the JSON described by the input_schema).
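
The otype__field format can be taken apart with a few lines of Python. This is only a sketch: parse_output_type is a hypothetical helper, it splits on the first double underscore, and it does not check otype against the type table.

```python
from typing import Optional, Tuple, Union

Field = Optional[Union[int, str]]


def parse_output_type(output_type: str) -> Tuple[str, Field]:
    """Split an output_type into (otype, field) (hypothetical helper).

    field is None for a plain type, an int for an array index,
    or a str naming a JSON key.
    """
    otype, separator, field = output_type.partition("__")
    if not separator:
        # No double underscore: a plain type such as "timestamp".
        return output_type, None
    if field == "":
        raise ValueError(f"missing field after '__' in {output_type!r}")
    if field.isdecimal():
        # Digits only: treat it as an unsigned array index.
        return otype, int(field)
    return otype, field


if __name__ == "__main__":
    for value in ["int__0", "json__nombre", "timestamp"]:
        print(value, parse_output_type(value))
```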

Examples

Assuming the following JSON is received from the database:

{
  "fecha": "2021-08-25",
  "archivo": {
    "nombre": "archivo importante",
    "url": "https://example.com/url"
  },
  "operaciones": [100, 200, 300, 400],
  "saldo": "$523"
}

And that this JSON should be presented to the end user in the view as follows:

        fecha        |  saldo  | op1 | op2 |   nombre_archivo   |       url_archivo       
---------------------+---------+-----+-----+--------------------+-------------------------
 2021-08-25 00:00:00 | $523.00 | 100 | 200 | archivo importante | https://example.com/url

The following input_schema and output_schema would be needed:

input_schema:

[
  {"fecha": "timestamp"},
  {"saldo": "money"},
  {"operaciones": "array_int"},
  {"operaciones": "array_int"},
  {"archivo": "json"},
  {"archivo": "json"}
]

output_schema:

[
  {"fecha": "timestamp"},
  {"saldo": "money"},
  {"op1": "int__0"},
  {"op2": "int__1"},
  {"nombre_archivo": "json__nombre"},
  {"url_archivo": "json__url"}
]
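
To make the relationship between the two schemas concrete, here is a minimal Python sketch that builds one output row from the example document. It rests on an assumption the post only implies: the i-th entry of output_schema is paired with the i-th entry of input_schema, which would explain why operaciones and archivo are listed twice. The function build_row is hypothetical, and no type casting is done, so fecha and saldo keep their raw string values instead of the timestamp and money formatting shown in the view.

```python
from typing import Any, Dict, List

DOC = {
    "fecha": "2021-08-25",
    "archivo": {"nombre": "archivo importante", "url": "https://example.com/url"},
    "operaciones": [100, 200, 300, 400],
    "saldo": "$523",
}

INPUT_SCHEMA = [
    {"fecha": "timestamp"},
    {"saldo": "money"},
    {"operaciones": "array_int"},
    {"operaciones": "array_int"},
    {"archivo": "json"},
    {"archivo": "json"},
]

OUTPUT_SCHEMA = [
    {"fecha": "timestamp"},
    {"saldo": "money"},
    {"op1": "int__0"},
    {"op2": "int__1"},
    {"nombre_archivo": "json__nombre"},
    {"url_archivo": "json__url"},
]


def build_row(
    doc: Dict[str, Any],
    input_schema: List[Dict[str, str]],
    output_schema: List[Dict[str, str]],
) -> Dict[str, Any]:
    """Pair schema entries by position and extract raw values (assumed behavior)."""
    if len(input_schema) != len(output_schema):
        raise ValueError("schemas must have the same number of entries")
    row: Dict[str, Any] = {}
    for in_entry, out_entry in zip(input_schema, output_schema):
        # Each schema entry is a single-item object: {key: type}.
        (in_key, _in_type), = in_entry.items()
        (column, out_type), = out_entry.items()
        value = doc[in_key]
        _otype, separator, field = out_type.partition("__")
        if separator:
            # Digits select an array element; anything else selects a JSON key.
            value = value[int(field)] if field.isdecimal() else value[field]
        row[column] = value
    return row


row = build_row(DOC, INPUT_SCHEMA, OUTPUT_SCHEMA)

if __name__ == "__main__":
    print(row)
```

The resulting dictionary has the six columns of the view in order (fecha, saldo, op1, op2, nombre_archivo, url_archivo); casting each value to its declared type is left out of the sketch.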