google SRE book notes No ratings yet.

google SRE book notes: Risk measure aggregate availabilty = successful requests/total requests ( instead of uptime/downtime) release: Branching All code is checked into the main branch of the source code tree (mainline). However, most major projects don’t release directly from the mainline. Instead, we branch from the mainline at a specific revision and never • Read More »

kafka msg format, how to publish, read No ratings yet.

How to Publish a kafka msg Kafka from programmer point of view is: just topic, key, value , headers send(topic, value=None, key=None, headers=None, partition=None, timestamp_ms=None)[source] Publish a message to a topic. Parameters: topic (str) – topic where the message will be published value (optional) – message value. Must be type bytes, or be serializable to bytes via configured value_serializer. If value • Read More »

ML workflow and pipeline orchestration No ratings yet.

Kale – Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows. Flyte – Easy to create concurrent, scalable, and maintainable workflows for machine learning. MLRun – Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines. Prefect – A workflow management system, designed for modern infrastructure. ZenML – An extensible open-source MLOps framework • Read More »

responsible AI No ratings yet.

General intro   Open source implementation Responsible AI Toolkits for AI Ethics & Privacy TensorFlow Privacy TensorFlow Privacy is a Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy. TensorFlow Federated TFF has been developed to facilitate open research and experimentation with Federated Learning • Read More »

open source linter and code coverage for C/C++ No ratings yet.

A poor man C/C++ linter and code coverage (gtests) C/C++ Linter: Cppcheck apt-get install cppcheck cppcheck –enable=all /your_cpp_source_dir   Test Code coverage : gcov/lcov g++ -o main –fprofile-arcs -ftest-coverage main_test.cpp -L /usr/lib -I/usr/include ./main gcov main_test.cpp lcov –coverage –directory . –output-file genhtml –output-directory out   CMake and code coverage for: • Read More »

column-oriented DB No ratings yet.

Free and open-source software Columnar DB   Database Name Language Implemented in Notes Apache Druid Java started in 2011 for low-latency massive ingestion and queries Apache Kudu C++ released in 2016 to complete the Apache Hadoop ecosystem Apache Pinot Java open sourced in 2015 for real-time low-latency analytics Calpont InfiniDB C++ ClickHouse C++ released in 2016 to • Read More »

zookeeper vs etcd 1/5 (1)

Use cases both provide strong consistance for key/value store. zookeeper use ZAB, etcd use raft, usually one leader. normally use as configure store Zookeeper more like file system bin/ -server LD_LIBRARY_PATH=. cli_mt set /zk_test junk get /zk etcd: etcdctl put greeting “Hello, etcd” etcdctl get greeting Documents Please rate this • Read More »

understand CAP theorem No ratings yet.

The CAP theorem states that a distributed system cannot simultaneously be consistent, available, and partition tolerant No distributed system is safe from network failures, thus network partitioning generally has to be tolerated.[7][8] In the presence of a partition, one is then left with two options: consistency or availability. CAP is often misunderstood as a choice at all times of • Read More »

Raft consensus algorithm on distributed system No ratings yet.

Raft: paxos hard to understand, new consensus algorithm consensus algorithm:  Leader elections, log replicate   This Raft library is stable and feature complete. As of 2016, it is the most widely used Raft library in production, serving tens of thousands clusters each day. It powers distributed systems such as etcd, Kubernetes, Docker Swarm, Cloud Foundry Diego, • Read More »