HiveNode is the worker component of the Hive system. It connects to a central HiveCore proxy and runs local inference using Ollama. By running HiveNode on any machine (on-premise, cloud, or behind firewalls), you can seamlessly join that machine's compute resources to the HiveCore network, making them available to serve requests routed by the central proxy.
- Overview
- Key Features
- Installation & Setup
- Configuration
- Running
- How it Works
- Logging & Monitoring
- Contributing
- License
## Overview

In the Hive architecture:

- HiveCore serves as the central proxy and gateway, managing queues and distributing inference requests.
- HiveNode runs on worker machines. It connects out to HiveCore (so the worker does not need to be publicly accessible). Once connected, HiveNode polls HiveCore for inference jobs and uses its local Ollama server to process them.
This design allows multiple machines, possibly spread across different networks, to operate as a single, unified inference cluster.
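
Schematically, the topology looks like this (an illustrative sketch; the connections toward HiveCore are opened by the workers themselves, so worker machines need no public address):

```
Clients ──▶ HiveCore (central proxy & queue)
                 ▲                    ▲
        outbound │           outbound │
                 │                    │
        HiveNode + Ollama     HiveNode + Ollama    ...one per worker machine
```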
## Key Features

- **Local Ollama Integration**: HiveNode processes requests by forwarding them to an Ollama instance running on the same machine (or accessible from that machine).
- **Multiple Concurrent Connections**: Each HiveNode can open several parallel connections to HiveCore, letting you scale inference throughput per worker.
- **Centralized Configuration & Scaling**: Workers require minimal configuration; just point them to HiveCore and set a valid key.
- **Extensible Logging**: Built-in InfluxDB logging for GPU usage (via NVML) and system metrics if environment variables are set.
## Installation & Setup

### Prerequisites

- Rust toolchain (for building HiveNode)
- Ollama installed locally (or accessible at a defined URL). See Configuration for help with this. Additionally, you can use our provided script to handle installation and multi-instance setup.
- A valid `Worker` key generated by HiveCore. (See HiveCore's management endpoints for instructions.)
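
A quick way to confirm the prerequisites are in place (illustrative commands; skip the Ollama check if it runs on another machine):

```bash
cargo --version     # Rust toolchain used to build HiveNode
ollama --version    # Ollama CLI, if Ollama runs on this machine
```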
- **Clone the repository**

  ```bash
  git clone https://github.com/VakeDomen/HiveNode.git
  cd HiveNode
  ```

- **Configure**

  - Rename `.env.sample` to `.env`:

    ```bash
    mv .env.sample .env
    ```

  - Configure the `.env` environment as defined in Configuration.

- **Run** (see Running)

  ```bash
  cargo run --release
  ```
## Configuration

A sample `.env` might look like:

```bash
# The address and port of HiveCore's node connection server (default 7777 in HiveCore).
HIVE_CORE_URL=hivecore.example.com:7777

# Worker key provided by HiveCore admin. Must have "Worker" role.
HIVE_KEY=my-secret-key

# URL to your local or remote Ollama instance, e.g. "http://localhost:11434"
OLLAMA_URL=http://localhost:11434

# Number of parallel connections to open to HiveCore. Best to match Ollama configuration.
CONCURRENT_RQEUESTS=4

# (Optional) InfluxDB settings for logging
INFLUX_HOST=http://localhost:8086
INFLUX_ORG=MY_ORG
INFLUX_TOKEN=my-token
```
- `HIVE_CORE_URL`: Where HiveNode connects to HiveCore (must match HiveCore's `NODE_CONNECTION_PORT`, by default `7777`).
- `HIVE_KEY`: The Worker key from HiveCore's admin interface. Required for authentication.
- `OLLAMA_URL`: The local or remote address of the Ollama service. Typically, Ollama runs on the same machine at `http://localhost:11434`.
- `CONCURRENT_RQEUESTS`: Sets how many parallel connections (and thus concurrent tasks) this HiveNode should proxy. Adjust based on your hardware resources and Ollama configuration.
- `INFLUX_*`: (Optional) If configured, HiveNode will record logs and GPU usage metrics to InfluxDB. If not provided, it simply won't log to Influx.
HiveNode expects an active Ollama server at the URL specified by `OLLAMA_URL` in your `.env`. The default installation of Ollama is enough if you only need one Ollama instance and want to send all requests through that instance.
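
To verify that the Ollama server is reachable before starting HiveNode, you can query its model-listing endpoint (`/api/tags` is Ollama's standard endpoint for listing installed models; adjust the host and port to match your `OLLAMA_URL`):

```bash
# Returns a JSON list of locally installed models if Ollama is up.
curl http://localhost:11434/api/tags
```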
If you have multiple GPUs or want to run several Ollama servers (perhaps each pinned to different GPUs), you can use our provided `start_ollama.sh` helper script. This script:
- Checks if netstat and curl are installed (and installs them if not).
- Installs Ollama if not already present.
- Lets you pick how many Ollama instances to run and how to assign GPUs to each instance.
- Finds free ports for each Ollama instance, runs it, and logs their output.
```bash
chmod +x start_ollama.sh
./start_ollama.sh
```
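
If you prefer to start instances by hand instead, here is a minimal sketch. It assumes two NVIDIA GPUs and relies on Ollama's standard `OLLAMA_HOST` variable to bind each instance to its own port; the ports and log file names are just examples:

```bash
# Instance pinned to GPU 0, listening on port 11434
CUDA_VISIBLE_DEVICES=0 OLLAMA_HOST=127.0.0.1:11434 ollama serve > ollama_gpu0.log 2>&1 &

# Instance pinned to GPU 1, listening on port 11435
CUDA_VISIBLE_DEVICES=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve > ollama_gpu1.log 2>&1 &
```

Each HiveNode instance would then set its `OLLAMA_URL` to the matching port.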
## Running

After configuring the `.env` file, run:

```bash
cargo run --release
```

Or compile and execute the binary directly:

```bash
cargo build --release
./target/release/hivenode
```
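
For long-running workers you will likely want the process supervised. As a minimal sketch (assuming HiveNode loads its `.env` from the current working directory, as dotenv-style loaders typically do), a simple restart loop could look like:

```bash
# Illustrative keep-alive loop; a process supervisor such as systemd is the more robust option.
while true; do
  ./target/release/hivenode
  echo "hivenode exited; restarting in 5s..." >&2
  sleep 5
done
```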
## How it Works

- **Authentication**
  - On startup, each of the `CONCURRENT_RQEUESTS` threads tries to authenticate to HiveCore using the key in `HIVE_KEY`.
  - Upon successful auth, HiveNode advertises its versions and its supported models to HiveCore.
- **Polling & Proxying**
  - HiveNode periodically polls HiveCore for incoming tasks. If HiveCore's queue has work for a given model, it dispatches it to the node.
  - HiveNode forwards the request to `OLLAMA_URL` for local inference (via the Ollama HTTP API), then streams the response back to HiveCore.
- **Reconnection & Control**
  - If the connection drops or an error occurs, HiveNode waits briefly, then reconnects.
  - HiveCore can issue commands like `REBOOT` or `SHUTDOWN`, which HiveNode listens for in the incoming messages (handy for maintenance or upgrades).
- **Scaling**
  - To add more capacity on the same machine, increase the `CONCURRENT_RQEUESTS` count.
  - To add more workers across multiple machines, simply run additional HiveNode instances (each with its own `.env` and valid Worker key).
  - Hive supports having multiple workers on the same machine, but each worker should have its own token (handy if multiple Ollama servers are split among multiple GPUs); see the sketch after this list.
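
As a sketch of that multi-worker setup on one machine (the directory names, keys, and ports below are placeholders), each worker gets its own directory with its own `.env`, pointing at a different Ollama instance:

```bash
# Worker 1: its .env sets HIVE_KEY=<worker-key-1> and OLLAMA_URL=http://localhost:11434
(cd ~/hivenode-gpu0 && ./hivenode) &

# Worker 2: its .env sets HIVE_KEY=<worker-key-2> and OLLAMA_URL=http://localhost:11435
(cd ~/hivenode-gpu1 && ./hivenode) &
```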
## Logging & Monitoring

HiveNode can log system metrics (like GPU usage, memory, and requests proxied) to InfluxDB:

- **Enable Influx Logging**: Provide `INFLUX_HOST`, `INFLUX_ORG`, and `INFLUX_TOKEN` in the `.env`.
- **GPU Metrics**: HiveNode uses NVML to gather GPU info (power usage, memory usage, etc.). This is only collected if an NVIDIA GPU is present and NVML is available on the system.
- **Request Streaming**: All inference requests and responses can be logged with success/error tags.

These metrics are pushed to Influx in the background. If any of the Influx environment variables are missing or invalid, HiveNode just skips that monitoring.
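
If you don't already have an InfluxDB instance, a quick way to stand one up for testing is the official Docker image (a sketch; the org and token are created by you during setup):

```bash
# Start a local InfluxDB 2.x for testing.
docker run -d --name influxdb -p 8086:8086 influxdb:2
# Complete the initial setup at http://localhost:8086 to create an org and an API token,
# then copy those values into HiveNode's .env as INFLUX_ORG and INFLUX_TOKEN
# (with INFLUX_HOST=http://localhost:8086).
```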
## Contributing

We welcome pull requests! Before submitting, please open an issue to discuss your proposed changes. Make sure to:
- Keep code style consistent.
- Update documentation if adding or changing features.
## License

HiveNode is distributed under the MIT License, just like HiveCore. See the LICENSE file in this repository for details.