Working Backwards (Insights, Stories, and Secrets from Inside Amazon) notes

Working Backwards (Insights, Stories, and Secrets from Inside Amazon) by Colin Bryar and Bill Carr. https://www.amazon.com/Working-Backwards-Insights-Stories-Secrets/dp/1250267595 It provides very good insight into how Amazon works. (1) Bar Raiser in the hiring process: one experienced interviewer, the bar raiser, takes part in the hiring and can coach and teach the interview process to avoid hiring low • Read More »


The Leader Habit

The book “The Leader Habit” provides some good insight into how to become a leader. Some good points: Delegate well: consider the person’s skills and interests, identify what needs to be accomplished, but let the person figure out how. Sell the vision: paint a picture of the next 3-5 years, and make the vision relevant to your followers by appealing • Read More »


Several Data Engineer and Machine Learning Engineer Positions (from Visa)

Please apply directly at:
https://jobs.smartrecruiters.com/Visa/743999808293049-staff-machine-learning-engineer-visa-ai-platform
https://jobs.smartrecruiters.com/Visa/743999831550334-senior-machine-learning-engineer-visa-ai-as-a-service
https://jobs.smartrecruiters.com/Visa/743999808350524-staff-data-engineer-visa-ai-platform
https://jobs.smartrecruiters.com/Visa/743999808340022-principal-machine-learning-engineer


Google SRE book notes

Google SRE book notes: https://sre.google/sre-book/table-of-contents/
Risk: measure aggregate availability = successful requests / total requests (instead of uptime/downtime).
Release: Branching. All code is checked into the main branch of the source code tree (mainline). However, most major projects don’t release directly from the mainline. Instead, we branch from the mainline at a specific revision and never • Read More »
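The aggregate availability measure in these notes can be sketched in a few lines of Python. This is only an illustration: the function name and the request counts are made up for the example and do not come from the SRE book.

```python
def aggregate_availability(successful_requests: int, total_requests: int) -> float:
    """Return request-based availability: successful requests / total requests."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= successful_requests <= total_requests:
        raise ValueError("successful_requests must be between 0 and total_requests")
    return successful_requests / total_requests


# Hypothetical day of traffic: 2,500,000 requests, 2,500 of which failed.
availability = aggregate_availability(2_497_500, 2_500_000)
print(f"aggregate availability: {availability:.4%}")
```

Counting requests rather than minutes of uptime means a short outage at peak traffic costs more availability than a long one at night, which is the point of preferring this measure.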


Kafka message format, how to publish, read

How to publish a Kafka message
From a programmer’s point of view, a Kafka message is just: topic, key, value, headers.
https://kafka-python.readthedocs.io/en/master/apidoc/KafkaProducer.html
send(topic, value=None, key=None, headers=None, partition=None, timestamp_ms=None)
Publish a message to a topic.
Parameters:
topic (str) – topic where the message will be published
value (optional) – message value. Must be type bytes, or be serializable to bytes via configured value_serializer. If value • Read More »
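A minimal sketch of publishing with kafka-python, assuming a broker is reachable at localhost:9092. The helper names encode_message and publish, the topic name, and the choice of JSON for the value are all assumptions made for this example, not part of the library.

```python
import json


def encode_message(key, value, headers):
    """Encode key, value and headers as bytes, which send() needs when no serializers are configured."""
    return {
        "key": key.encode("utf-8"),
        "value": json.dumps(value).encode("utf-8"),  # JSON is an assumed encoding
        "headers": [(name, text.encode("utf-8")) for name, text in headers.items()],
    }


def publish(topic, key, value, headers, bootstrap_servers="localhost:9092"):
    """Publish one message and return the (partition, offset) it was written to."""
    from kafka import KafkaProducer  # kafka-python, imported lazily

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    try:
        msg = encode_message(key, value, headers)
        future = producer.send(
            topic, value=msg["value"], key=msg["key"], headers=msg["headers"]
        )
        metadata = future.get(timeout=10)  # block until the broker acknowledges
        return metadata.partition, metadata.offset
    finally:
        producer.close()


if __name__ == "__main__":
    # Hypothetical topic and payload; requires a running broker.
    print(publish("demo-topic", "user-42", {"event": "login"}, {"source": "web"}))
```

Headers are passed as a list of (str, bytes) tuples, and the key decides which partition the message goes to unless partition is given explicitly.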


ML workflow and pipeline orchestration

Kale – Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.
Flyte – Easy to create concurrent, scalable, and maintainable workflows for machine learning.
MLRun – Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines.
Prefect – A workflow management system, designed for modern infrastructure.
ZenML – An extensible open-source MLOps framework • Read More »



Responsible AI

General intro
https://github.com/alexandrainst/responsible-ai
https://ai.google/responsibilities/responsible-ai-practices/
https://www.tensorflow.org/responsible_ai

Open source implementation
https://github.com/microsoft/responsible-ai-toolbox
https://www.tensorflow.org/responsible_ai/api_docs
https://opendatascience.com/15-open-source-responsible-ai-toolkits-and-projects-to-use-today/

Responsible AI Toolkits for AI Ethics & Privacy
TensorFlow Privacy: TensorFlow Privacy is a Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
TensorFlow Federated: TFF has been developed to facilitate open research and experimentation with Federated Learning • Read More »


Open source linter and code coverage for C/C++

A poor man’s C/C++ linter and code coverage (gtests)

C/C++ linter: Cppcheck
apt-get install cppcheck
cppcheck --enable=all /your_cpp_source_dir

Test code coverage: gcov/lcov
g++ -o main -fprofile-arcs -ftest-coverage main_test.cpp -L /usr/lib -I/usr/include
./main
gcov main_test.cpp
lcov --capture --directory . --output-file main_coverage.info
genhtml main_coverage.info --output-directory out

https://medium.com/@naveen.maltesh/generating-code-coverage-report-using-gnu-gcov-lcov-ee54a4de3f11
https://dr-kino.github.io/2019/12/22/test-coverage-using-gtest-gcov-and-lcov/

CMake and code coverage for: • Read More »


Column-oriented DB

Free and open-source columnar DB software

Database Name      Language Implemented in   Notes
Apache Druid       Java                      started in 2011 for low-latency massive ingestion and queries
Apache Kudu        C++                       released in 2016 to complete the Apache Hadoop ecosystem
Apache Pinot       Java                      open sourced in 2015 for real-time low-latency analytics
Calpont InfiniDB   C++
ClickHouse         C++                       released in 2016 to • Read More »