Working Backwards (Insights, Stories, and Secrets from Inside Amazon) notes

Working Backwards (Insights, Stories, and Secrets from Inside Amazon), by Colin Bryar and Bill Carr, gives very good insight into how Amazon works. (1) Bar Raiser in the hiring process: one experienced interviewer, the bar raiser, takes part in each hiring decision and can coach and teach the interview process to avoid hiring low

The Leader Habit

The book "The Leader Habit" provides some good insight into how to become a leader. Some good points: Delegate well: consider the person's skills and interests, identify what needs to be accomplished, but let the person figure out how. Sell the vision: paint a picture of the next 3-5 years, and make the vision relevant to your followers by appealing

Google SRE book notes

Google SRE book notes.
Risk: measure aggregate availability = successful requests / total requests (instead of uptime/downtime).
Release: Branching. All code is checked into the main branch of the source code tree (mainline). However, most major projects don't release directly from the mainline. Instead, we branch from the mainline at a specific revision and never
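The request-based availability measure in the notes above can be sketched in a few lines. This is a minimal illustration only: the function name and the request counts are hypothetical, not taken from the SRE book.

```python
def aggregate_availability(successful_requests: int, total_requests: int) -> float:
    """Return successful requests / total requests, as a fraction in [0, 1]."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= successful_requests <= total_requests:
        raise ValueError("successful_requests must be between 0 and total_requests")
    return successful_requests / total_requests


# Hypothetical counts for one day of traffic (made-up numbers).
availability = aggregate_availability(999_500, 1_000_000)
print(f"aggregate availability: {availability:.4%}")
```

Unlike uptime/downtime, this ratio can be computed per service from request logs even when a system is never fully "down", only partially failing.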

Kafka message format: how to publish and read

How to publish a Kafka message.
From a programmer's point of view, a Kafka message is just: topic, key, value, headers.
send(topic, value=None, key=None, headers=None, partition=None, timestamp_ms=None)
Publish a message to a topic.
Parameters:
topic (str) – topic where the message will be published
value (optional) – message value. Must be type bytes, or be serializable to bytes via configured value_serializer. If value
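To make the topic/key/value/headers view concrete, here is a minimal sketch that prepares the arguments for a send() call with the signature quoted above. It assumes JSON-encoded values, UTF-8 keys, and headers given as (name, bytes) pairs; `build_record` and `RecordingProducer` are hypothetical helpers for illustration, and the stand-in producer only records calls so the example runs without a broker.

```python
import json
from typing import Any, Dict, Optional


def build_record(topic: str, value: Any, key: Optional[str] = None,
                 headers: Optional[Dict[str, str]] = None) -> dict:
    """Build keyword arguments for send(): the value and key are turned into bytes."""
    return {
        "topic": topic,
        # Assumption: values are JSON documents; any bytes encoding would do.
        "value": json.dumps(value).encode("utf-8"),
        "key": key.encode("utf-8") if key is not None else None,
        # Assumption: headers are passed as a list of (str, bytes) pairs.
        "headers": [(name, val.encode("utf-8")) for name, val in headers.items()]
        if headers else None,
    }


class RecordingProducer:
    """Hypothetical stand-in that mimics the quoted send() signature."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, topic, value=None, key=None, headers=None,
             partition=None, timestamp_ms=None):
        self.sent.append((topic, key, value, headers))


producer = RecordingProducer()
record = build_record("orders", {"id": 42}, key="customer-7",
                      headers={"source": "web"})
producer.send(**record)
print(producer.sent[0])
```

With a real client, the same `producer.send(**record)` call would be made on a producer connected to a broker. If a value_serializer is configured on that producer, the manual encoding in `build_record` would be redundant.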

ML workflow and pipeline orchestration

Kale – Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.
Flyte – Easy to create concurrent, scalable, and maintainable workflows for machine learning.
MLRun – Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines.
Prefect – A workflow management system, designed for modern infrastructure.
ZenML – An extensible open-source MLOps framework

responsible AI

General intro
Open source implementation
Responsible AI Toolkits for AI Ethics & Privacy
TensorFlow Privacy: TensorFlow Privacy is a Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
TensorFlow Federated: TFF has been developed to facilitate open research and experimentation with Federated Learning
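As a rough illustration of what training with differential privacy involves, the sketch below shows the core step of differentially private SGD in plain Python: clip each per-example gradient to a maximum L2 norm, sum, add Gaussian noise scaled to that norm, then average. This is not the TensorFlow Privacy API. The function names, the toy gradients, and the parameter values are all assumptions for illustration, and the sketch does not track the resulting privacy budget.

```python
import math
import random
from typing import List


def clip_gradient(grad: List[float], clip_norm: float) -> List[float]:
    """Scale grad down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads: List[List[float]], clip_norm: float,
                          noise_multiplier: float, rng: random.Random) -> List[float]:
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is proportional to the clipping norm.
    stddev = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    return [n / len(clipped) for n in noisy]


# Toy per-example gradients (made-up numbers).
grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)
print(private_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single example can move the update, and the noise masks what remains. A library such as TensorFlow Privacy wraps this step inside an optimizer and adds privacy accounting.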

open source linter and code coverage for C/C++

A poor man's C/C++ linter and code coverage (gtest)
C/C++ linter: Cppcheck
apt-get install cppcheck
cppcheck --enable=all /your_cpp_source_dir
Test code coverage: gcov/lcov
g++ -o main -fprofile-arcs -ftest-coverage main_test.cpp -L /usr/lib -I/usr/include
./main
gcov main_test.cpp
lcov --coverage --directory . --output-file <tracefile>
genhtml <tracefile> --output-directory out
(<tracefile> is a placeholder; the file name is missing from these notes.)
CMake and code coverage for:

column-oriented DB

Free and open-source software Columnar DB

Database Name      Language Implemented in   Notes
Apache Druid       Java                      started in 2011 for low-latency massive ingestion and queries
Apache Kudu        C++                       released in 2016 to complete the Apache Hadoop ecosystem
Apache Pinot       Java                      open sourced in 2015 for real-time low-latency analytics
Calpont InfiniDB   C++
ClickHouse         C++                       released in 2016 to