Archives:

Several Data Engineer and Machine Learning Engineer Positions (from Visa)

Please apply directly at:
https://jobs.smartrecruiters.com/Visa/743999808293049-staff-machine-learning-engineer-visa-ai-platform
https://jobs.smartrecruiters.com/Visa/743999831550334-senior-machine-learning-engineer-visa-ai-as-a-service
https://jobs.smartrecruiters.com/Visa/743999808350524-staff-data-engineer-visa-ai-platform
https://jobs.smartrecruiters.com/Visa/743999808340022-principal-machine-learning-engineer


google SRE book notes

google SRE book notes: https://sre.google/sre-book/table-of-contents/

Risk
Measure aggregate availability = successful requests / total requests (instead of a time-based uptime/downtime measure).

Release: Branching
All code is checked into the main branch of the source code tree (mainline). However, most major projects don't release directly from the mainline. Instead, we branch from the mainline at a specific revision and never ...
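The aggregate availability ratio from the risk note can be sketched in a few lines of Python. This is only an illustration: the function name and the request counts are made up for the example and do not come from the SRE book.

```python
def aggregate_availability(successful_requests: int, total_requests: int) -> float:
    """Return successful requests divided by total requests."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= successful_requests <= total_requests:
        raise ValueError("successful_requests must be between 0 and total_requests")
    return successful_requests / total_requests


# Hypothetical day of traffic: 2,500,000 requests, 2,500 of which failed.
availability = aggregate_availability(2_497_500, 2_500_000)
print(f"{availability:.4%}")
```

Unlike a time-based measure, this ratio still makes sense for a service that is never fully "down" but fails some fraction of its requests.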


kafka msg format, how to publish, read

How to publish a Kafka message

From the programmer's point of view, a Kafka message is just a topic, key, value, and headers.

https://kafka-python.readthedocs.io/en/master/apidoc/KafkaProducer.html

send(topic, value=None, key=None, headers=None, partition=None, timestamp_ms=None)
Publish a message to a topic.

Parameters:
topic (str) – topic where the message will be published
value (optional) – message value. Must be type bytes, or be serializable to bytes via configured value_serializer. If value ...
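As a rough sketch of the topic/key/value/headers view, the helper below prepares those four fields as bytes so they can be passed to KafkaProducer.send. The helper name build_record, the topic "orders", the payload, and the broker address in the comments are all hypothetical; JSON is just one possible serialization.

```python
import json


def build_record(topic, value, key=None, headers=None):
    """Build keyword arguments for KafkaProducer.send with bytes payloads."""
    return {
        "topic": topic,
        # value must be bytes unless a value_serializer is configured
        "value": json.dumps(value).encode("utf-8"),
        "key": key.encode("utf-8") if key is not None else None,
        # headers are a list of (str, bytes) tuples
        "headers": [(name, text.encode("utf-8"))
                    for name, text in (headers or {}).items()],
    }


record = build_record(
    "orders",
    {"order_id": 42, "status": "paid"},
    key="customer-7",
    headers={"source": "checkout"},
)
print(record)

# Assuming kafka-python is installed and a broker listens on localhost:9092:
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   producer.send(**record)
#   producer.flush()
```

Keeping serialization in a small function like this makes the bytes requirement explicit; configuring value_serializer and key_serializer on the producer is an alternative.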


ML workflow and pipeline orchestration

Kale – Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.
Flyte – Easy to create concurrent, scalable, and maintainable workflows for machine learning.
MLRun – Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines.
Prefect – A workflow management system, designed for modern infrastructure.
ZenML – An extensible open-source MLOps framework ...



responsible AI

General intro
https://github.com/alexandrainst/responsible-ai
https://ai.google/responsibilities/responsible-ai-practices/
https://www.tensorflow.org/responsible_ai

Open source implementation
https://github.com/microsoft/responsible-ai-toolbox
https://www.tensorflow.org/responsible_ai/api_docs
https://opendatascience.com/15-open-source-responsible-ai-toolkits-and-projects-to-use-today/

Responsible AI Toolkits for AI Ethics & Privacy

TensorFlow Privacy
TensorFlow Privacy is a Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.

TensorFlow Federated
TFF has been developed to facilitate open research and experimentation with Federated Learning ...


open source linter and code coverage for C/C++

A poor man's C/C++ linter and code coverage (gtests)

C/C++ linter: Cppcheck
    apt-get install cppcheck
    cppcheck --enable=all /your_cpp_source_dir

Test code coverage: gcov/lcov
    g++ -o main -fprofile-arcs -ftest-coverage main_test.cpp -L /usr/lib -I/usr/include
    ./main
    gcov main_test.cpp
    lcov --capture --directory . --output-file main_coverage.info
    genhtml main_coverage.info --output-directory out

https://medium.com/@naveen.maltesh/generating-code-coverage-report-using-gnu-gcov-lcov-ee54a4de3f11
https://dr-kino.github.io/2019/12/22/test-coverage-using-gtest-gcov-and-lcov/

CMake and code coverage for: ...


column-oriented DB

Free and open-source software Columnar DB

Database Name      Language Implemented in   Notes
Apache Druid       Java                      started in 2011 for low-latency massive ingestion and queries
Apache Kudu        C++                       released in 2016 to complete the Apache Hadoop ecosystem
Apache Pinot       Java                      open sourced in 2015 for real-time low-latency analytics
Calpont InfiniDB   C++
ClickHouse         C++                       released in 2016 to ...


zookeeper vs etcd

Use cases
Both provide a strongly consistent key/value store. ZooKeeper uses ZAB and etcd uses Raft; each usually has one leader. Both are normally used as a configuration store.

ZooKeeper is more like a file system:
https://zookeeper.apache.org/doc/r3.3.6/zookeeperStarted.html
    bin/zkCli.sh -server 127.0.0.1:2181
    LD_LIBRARY_PATH=. cli_mt 127.0.0.1:2181
    set /zk_test junk
    get /zk_test

etcd:
https://etcd.io/docs/v3.5/quickstart/
    etcdctl put greeting "Hello, etcd"
    etcdctl get greeting

Documents
https://www.youtube.com/watch?v=BhosKsE8up8&ab_channel=BitTiger%E5%AE%98%E6%96%B9%E9%A2%91%E9%81%93BitTigerOfficialChannel


understand CAP theorem

The CAP theorem states that a distributed system cannot simultaneously be consistent, available, and partition tolerant. No distributed system is safe from network failures, thus network partitioning generally has to be tolerated.[7][8] In the presence of a partition, one is then left with two options: consistency or availability. CAP is often misunderstood as a choice at all times of ...
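The consistency-or-availability choice during a partition can be illustrated with a toy model. The sketch below is a deliberately simplified, hypothetical two-replica register (the class and its "CP"/"AP" modes are invented for this example and do not model any real database): in CP mode a write during a partition is refused, in AP mode it is accepted locally and the replicas diverge.

```python
class Unavailable(Exception):
    """Raised when a CP register refuses a request during a partition."""


class TwoReplicaRegister:
    """Toy register replicated on two nodes (indices 0 and 1)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = [0, 0]
        self.partitioned = False

    def write(self, replica, value):
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            self.values = [value, value]
        elif self.mode == "CP":
            # Keep consistency by giving up availability.
            raise Unavailable("cannot reach the other replica")
        else:
            # Keep availability by letting the replicas diverge.
            self.values[replica] = value

    def read(self, replica):
        return self.values[replica]


ap = TwoReplicaRegister("AP")
ap.write(0, 1)
ap.partitioned = True
ap.write(0, 2)
print(ap.read(0), ap.read(1))  # replicas now disagree

cp = TwoReplicaRegister("CP")
cp.write(0, 1)
cp.partitioned = True
try:
    cp.write(0, 2)
except Unavailable as exc:
    print("write rejected:", exc)
```

While the network is healthy both modes behave identically; the trade-off only appears once partitioned is set.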