Chinese Yellow Pages | Classifieds | Knowledge | Tax | IME

column-oriented DB

Free and open-source software Columnar DB

Database Name Language Implemented in Notes
Apache Druid Java started in 2011 for low-latency massive ingestion and queries
Apache Kudu C++ released in 2016 to complete the Apache Hadoop ecosystem
Apache Pinot Java open sourced in 2015 for real-time low-latency analytics
Calpont InfiniDB C++
ClickHouse C++ released in 2016 to analyze data that is updated in real time
CrateDB Java
DuckDB C++ An embeddable, in-process, column-oriented SQL OLAP RDBMS
Databend Rust An elastic and reliable Serverless Data Warehouse
InfluxDB Go time series database
Greenplum Database C
PostgreSQL cstore fdw,[1] vops[2] C cstore_fdw uses ORC format
MariaDB ColumnStore C & C++ formerly Calpont InfiniDB
MapD C++
Metakit C++
MonetDB C


Columns store db:

are mainly relational, primarily used for data warehouse applications

store /access as columns

optimized for answering questions about a single attribute

MariaDB ColumnStore

it could be select, sum join, where , group by order by etc  on columnar db:

select c.c_mktsegment cust_mkt_segment,

sum(o.o_totalprice) total_order_amount

from orders o join customer c on o.o_custkey = c.c_custkey

where c.c_nationkey = 24

and o.o_orderDATE >= ‘2016-10-01’

and o.o_orderDATE < ‘2017-01-01’

group by total_order_amount

order by total_order_amount;


Bigtable/HBase/Cassandra ( Column Family db):

it is a column family db,( based on google’s bigtable struct)

not columnar-db

good for big data on hadoop, better more than 5 servers

and stored/accessed as row ( or column/row mixed )

optimized for answers questions about a row

CF1          CF2

col1, col2    col3

where CF1, CF2 are stored as column

and col1/2 are stored as row

hbase shell:

create ‘customers’, ‘details’,’relatives’,’accounts’

# put <table>,<row key>, <CF:Qualifier>, <Value>  //insert or update the row

put ‘customers’, ‘101’,’details:name’, ‘Adam’  # can only write one qualifier/value at one time

put ‘customers’, ‘101’,’details:Email’, ‘’

scan ‘customers’, {COLUMNS => ‘details’}

alter ‘customers’, {NAME =>’accounts’, VERSIONS =>5 } # enable 5 versions of accounts

get ‘customers’, ‘102’, { COLUMN=>’accounts:checking’, VERSION=>5} #return multiple versions

delete ‘customers’, ‘101’, ‘accounts:business’ # delete that column family’s qualifier/attributes


scan ‘customers’, {COLUMNS=>’details:name’, FILTER => “ValueFilter(>,’binaryprefix:B’)”}




Please rate this

Comments are closed here.