EricZ's repos on GitHub
Python · 2807 followers
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Python · 603 followers
SetSimilaritySearch
All-pair set similarity search on millions of sets in Python and on a laptop
Go · 107 followers
lsh
Locality Sensitive Hashing for Go (Multi-probe LSH, LSH Forest, basic LSH)
Go · 60 followers
lshensemble
LSH index for approximate set containment search
Go · 35 followers
go-fasttext
Facebook fastText database in SQLite with Go API
Go · 27 followers
go-sql-lsh
Locality Sensitive Hashing using Golang and SQL database
Go · 18 followers
josie
Code and Benchmarks for JOSIE (SIGMOD 2019)
Go · 11 followers
go-datasketch
Probabilistic data structures for processing very large datasets (MinHash, HyperLogLog)
JavaScript · 10 followers
planning-poker
Planning Poker game for scrum team planning using Meteor.js
Go · 9 followers
datatable
An in-memory relational table in Go similar to C#'s System.Data.DataTable.
Go · 4 followers
counter
A frequency counter similar to Python's collections.Counter with additional support of other statistics.
Python · 3 followers
rfc6266
Content-Disposition header support for Python
Python · 2 followers
automl-gs
Provide an input CSV and a target field to predict, generate a model + code to run it.
Python · 2 followers
gpt_index
GPT Index (LlamaIndex) is a project consisting of a set of data structures designed to make it easier to use large external knowledge bases with LLMs.
Java · 2 followers
secxbrl
Download SEC XBRL Filings
Python · 1 follower
agent-framework
A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.
1 follower
AgenticCookBook
The “Agentic Cookbook for Generative AI Agent usage” is a comprehensive guide designed to empower users with the knowledge and tools to effectively implement and utilize Generative AI Agents within their workflows.
1 follower
autogen-ext-mcp
Makes Model Context Protocol server tools available in AutoGen >= v0.4
Jupyter Notebook · 1 follower
FLAML
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
1 follower
garnet
Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication features. Garnet can work with existing Redis clients.
Go · 1 follower
go-minhash
BottomK minwise hashing for streaming set similarity
C++ · 1 follower
mldb
MLDB is the Machine Learning Database
Python · 1 follower
nserc-subjects
Use NSERC award application summaries to predict research subjects
Jupyter Notebook · 0 followers
autogen
Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ
0 followers
Autogen_GraphRAG_Ollama
Microsoft's GraphRAG + AutoGen + Ollama + Chainlit = Fully Local & Free Multi-Agent RAG Superbot
0 followers
awesome-llm-apps
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
Jupyter Notebook · 0 followers
big-ann-benchmarks
Framework for evaluating ANNS algorithms on billion scale datasets.
Java · 0 followers
bigdata-interop
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
JavaScript · 0 followers
binaryworm
A small web game inspired by a puzzle.
Go · 0 followers
binsort
Binsort is a tool to sort files of fixed-length binary records
C · 0 followers
bitarray
Efficient arrays of booleans for Python
Python · 0 followers
ckanapi
A command line interface and Python module for accessing the CKAN Action API
TeX · 0 followers
csc373ta
Tutorial materials for CSC373
Rust · 0 followers
differential-dataflow
An implementation of differential dataflow using timely dataflow in Rust.
0 followers
DiskANN
Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
0 followers
FinRobot
FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
0 followers
gitignore
A collection of useful .gitignore templates
Go · 0 followers
go-mysql-server
An extensible MySQL server implementation in Go.
C · 0 followers
go-sqlite3
sqlite3 driver for Go using database/sql
Python · 0 followers
gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
0 followers
h2ogpt
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
Python · 0 followers
hgobequip
Equipment management panel for Hanggao Observatory
0 followers
hnswlib
Header-only C++/python library for fast approximate nearest neighbors
CSS · 0 followers
indepth
Astrophotography Gallery for Hanggao Observatory
Go · 0 followers
lane
A Go library implementing queues, stacks and deques
Python · 0 followers
langchain
⚡ Building applications with LLMs through composability ⚡
JavaScript · 0 followers
leon
🧠 Leon is your open-source personal assistant.
JavaScript · 0 followers
luceneutil
Various utility scripts for running Lucene performance tests
Python · 0 followers
messytables
Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py
0 followers
msticpy
Microsoft Threat Intelligence Security Tools
C++ · 0 followers
nmslib
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
0 followers
OptiGuide
Large Language Models for Supply Chain Optimization
Go · 0 followers
pgfutter
Import CSV and JSON into PostgreSQL the easy way
Python · 0 followers
pg_probackup
Backup and recovery manager for PostgreSQL
C · 0 followers
pg_query_state
Tool for query progress monitoring in PostgreSQL
C · 0 followers
pg_similarity
Set of functions and operators for executing similarity queries
Shell · 0 followers
postgres
Docker Official Image packaging for Postgres
Python · 0 followers
promptflow
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
Go · 0 followers
prototool
Your Swiss Army Knife for Protocol Buffers
Go · 0 followers
pulumi
Define cloud apps and infrastructure in your favorite language and deploy to any cloud
Python · 0 followers
pysparnn
Approximate Nearest Neighbor Search for Sparse Data in Python!
Python · 0 followers
sampleproject
A sample project that exists for PyPUG's "Tutorial on Packaging and Distributing Projects"
Python · 0 followers
spinningup
An educational resource to help anyone learn deep reinforcement learning.
Python · 0 followers
sqlify
Create a SQLite database from an Excel spreadsheet
Python · 0 followers
stylegan
StyleGAN - Official TensorFlow Implementation
Python · 0 followers
tabulator-py
Consistent interface for stream reading and writing tabular data (csv/xls/json/etc).