Security

Administrator responsibilities

As an EdgeSet administrator, you must do the following to maintain the security of your instance:

Treat your license key as a secret

EdgeSet backups are always encrypted. However, they can be decrypted by anyone who can use your license key to create their own EdgeSet instance.

Warning

Never store your license key in the same place you store your backups.

Review saved queries before taking ownership of them

Saved queries execute with the permissions of the query’s owner. If you take ownership of a saved query, everyone who has access to that query will be able to view any data it retrieves, regardless of their own permissions.

Why do saved queries run as the query owner?

Why not run saved queries with the permissions of the person running them? There are two reasons:

It prevents users from running untrusted queries that might have unintended side effects (leading to privilege escalation for the query owner).
It allows users with access to sensitive data to share safe aggregated or masked views of the data with others.

Security features

EdgeSet is designed for security and privacy. It is “local-first” software that does not require an internet connection and can run in air-gapped environments.

Data sovereignty

Only you have access to your data. EdgeSet does not send your data to Tetmon or any third-party servers or services.
Neither Tetmon nor Tetmon’s partners have access to your EdgeSet instance (unless you deliberately create a user account for them).
EdgeSet is hypervisor-agnostic. It can be run on-prem or in the cloud provider of your choice in any geographic region.
There are no backdoors, spyware, or data exfiltration channels in EdgeSet, nor has Tetmon Pte. Ltd. been ordered to install any backdoors, spyware, or data exfiltration channels.

Credentials, passwords, and keys

Data source credentials (passwords, keys, etc.) are encrypted using the ChaCha20 cipher (256-bit security, stronger than AES-256, selected by Google for use in HTTP/3) when they are saved to disk.
When data source credentials are held in memory, they are marked so that they cannot be accidentally saved to disk or printed in logs and so that the memory location that they occupied is scrubbed when freed.
Data source credentials are never transmitted to users, not even to administrators, not even when editing the data source.
Account passwords are stored only as one-way cryptographic hashes, each with its own salt, computed with a modern memory-hard hashing function (Argon2id) that resists GPU cracking and side-channel attacks.
2FA is available for user accounts.
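
The salted, memory-hard password hashing described above can be sketched as follows. Python's standard library does not include Argon2id, so this sketch substitutes `hashlib.scrypt`, a different memory-hard function, purely for illustration. The cost parameters and the function names `hash_password` and `verify_password` are assumptions, not EdgeSet's implementation.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative scrypt cost parameters (assumptions, not EdgeSet's settings).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def _derive(password: str, salt: bytes) -> bytes:
    """Run the memory-hard function over the password and salt."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                          n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest), using a fresh random salt for each password."""
    salt = os.urandom(16)
    return salt, _derive(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected)
```

Because each password gets its own salt, two accounts with the same password store different digests, so a precomputed table cannot be reused across accounts.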

Access controls

All access to the system (web, API, or otherwise) requires authentication.
EdgeSet’s permission system is secure by default: users are not automatically granted access, even to new data sources.
Permissions are granular: restrictions can be applied at the data source, folder, table, and column level.
Every EdgeSet update has undergone automated permissions system correctness testing.
All queries are logged.
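
A default-deny, multi-level permission check like the one described above might look roughly like this. It is a minimal sketch under stated assumptions: the `Permissions` class, the tuple-based resource paths, and the rule that a restriction at any level overrides a broader grant are all hypothetical and not taken from EdgeSet.

```python
from dataclasses import dataclass, field

# A resource path narrows from data source to column, for example
# ("warehouse", "finance", "invoices", "amount"). All names are hypothetical.


@dataclass
class Permissions:
    grants: set = field(default_factory=set)  # path prefixes the user may read
    denies: set = field(default_factory=set)  # path prefixes explicitly restricted

    def can_read(self, path: tuple) -> bool:
        # Every enclosing level: data source, folder, table, column.
        prefixes = [path[:i] for i in range(1, len(path) + 1)]
        # Assumption: a restriction at any level wins over a broader grant.
        if any(p in self.denies for p in prefixes):
            return False
        # Default deny: access exists only if some enclosing level was granted.
        return any(p in self.grants for p in prefixes)
```

A new user starts with empty `grants`, so every `can_read` call returns `False` until an administrator adds a grant.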

Encryption

EdgeSet requires encryption for all inbound connections.
Insecure ciphers and key exchange algorithms are disabled.
All files containing query results (saved query results, temporary files) are encrypted with individual (derived) encryption keys to protect against known-plaintext attacks.
Backup files are encrypted (ChaCha20).
Each EdgeSet installation uses a unique encryption key.
Symmetric keys are held in memory only or stored encrypted (by envelope encryption).
EdgeSet can register its own TLS certificate via Let’s Encrypt (HTTP challenge), create a self-signed certificate, or be proxied by an HTTPS-terminating load balancer.
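
To illustrate the idea of individual (derived) keys per result file, here is a minimal sketch that derives a distinct key for each file from one master key. It assumes HKDF with HMAC-SHA256 (RFC 5869) as the derivation function. The source does not say which derivation EdgeSet uses, and the names `hkdf_sha256`, `file_key`, and the `result-file:` label are hypothetical.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(master_key: bytes, info: bytes, length: int = 32,
                salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by HASH_LEN zero bytes.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, master_key, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def file_key(master_key: bytes, file_id: str) -> bytes:
    """Derive a 256-bit key bound to one file identifier."""
    return hkdf_sha256(master_key, info=b"result-file:" + file_id.encode("utf-8"))
```

With this approach, no two files share a key, so learning the plaintext of one file reveals nothing about the key used for another.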

Firewall

EdgeSet inbound (ingress) firewall rules
Port	Protocol	Purpose	Required
22	TCP	SSH-compatible interface	no
80	TCP	Setup web interface	only during setup, for Let’s Encrypt HTTP challenges, or when proxied by an HTTPS-terminating load balancer
443	TCP	Web interface + Presto-compatible API	yes
5432	TCP	PostgreSQL-compatible interface	no
type 8	ICMP	ping	no

EdgeSet runs its own firewall and can additionally be placed behind other firewalls.
EdgeSet can run without internet access (“air-gapped”).
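
When an outer firewall sits in front of EdgeSet, it can be useful to confirm that only the intended TCP ports from the inbound table are reachable. The following is a minimal sketch of such a probe. The function names are hypothetical, the host you pass in is your own instance's address, and the ICMP rule is not covered because it is not a TCP port.

```python
import socket

# Inbound TCP ports listed in the ingress table above.
INGRESS_TCP_PORTS = {
    22: "SSH-compatible interface",
    80: "Setup web interface",
    443: "Web interface + Presto-compatible API",
    5432: "PostgreSQL-compatible interface",
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out, or unresolvable host.
        return False


def audit_ingress(host: str, timeout: float = 2.0) -> dict:
    """Map each ingress TCP port to whether it is currently reachable."""
    return {port: is_port_open(host, port, timeout) for port in INGRESS_TCP_PORTS}
```

Comparing the result of `audit_ingress` with the Required column shows whether optional ports such as 22 or 5432 are exposed when you do not intend them to be.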

While EdgeSet does not require access to the internet, some optional functionality does depend on outbound access.

Optional outbound (egress) firewall rules
Port	Protocol	Host	Purpose
443	TCP	build.tetmon.com	software updates
443	TCP	www.googleapis.com	Google Drive data sources
443	TCP	Amazon S3 regional endpoints	S3 data sources
443	TCP	Alibaba OSS endpoints	OSS data sources
443	TCP	BigQuery regional endpoints	BigQuery data sources
443	TCP	Amazon Redshift endpoints	Redshift data sources
443	TCP	account_identifier.snowflakecomputing.com	Snowflake data sources

Hardening

No shell access: the SSH-compatible interface on port 22 does not allow shell access (it is a menu-driven text UI that cannot be escaped from into a shell).
No user accounts: there are no system-level (Linux) user accounts on the EdgeSet server and the root user is not allowed to log in.
Signed software: all system software (including EdgeSet) is cryptographically signed by Tetmon.
Only whitelisted services: only services essential for the operation of EdgeSet run on the server.
Read-only applications: all system software (including EdgeSet) is mounted read-only and cannot be modified except by applying software updates cryptographically signed by Tetmon.
Read-only system logs: system log entries are timestamped and are viewable (but not writable) by administrators.

Web interface

All web assets are self-hosted by the EdgeSet installation (no JavaScript libraries or other web assets are fetched from any third-party webserver).
Cross-site scripting (XSS) countermeasures are built into EdgeSet.
Strict Content Security Policy headers are enforced.
Authentication cookies are tamper-proof, inaccessible to JavaScript, and sent over HTTPS only.
No passwords or credentials are stored in the web browser.
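
The cookie properties listed above (tamper-proof, inaccessible to JavaScript, HTTPS only) can be illustrated with a short sketch. It assumes tamper detection via an HMAC-SHA256 tag appended to the cookie value, which is one common approach and not necessarily the one EdgeSet uses. The cookie name `session` and the helper names are hypothetical.

```python
import hashlib
import hmac
from http.cookies import SimpleCookie
from typing import Optional


def sign(value: str, key: bytes) -> str:
    """Append an HMAC-SHA256 tag so any change to the value is detectable."""
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(signed: str, key: bytes) -> Optional[str]:
    """Return the original value if the tag matches, otherwise None."""
    value, _, mac = signed.rpartition(".")
    expected = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    ok = hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8"))
    return value if ok else None


def session_cookie_header(session_id: str, key: bytes) -> str:
    """Build a Set-Cookie value with the HttpOnly and Secure attributes."""
    cookie = SimpleCookie()
    cookie["session"] = sign(session_id, key)
    cookie["session"]["httponly"] = True  # hidden from JavaScript
    cookie["session"]["secure"] = True    # sent over HTTPS only
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()
```

The signing key stays on the server, so a client that edits the cookie value cannot produce a matching tag.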

AI models

All AI models (including LLMs) run within EdgeSet’s VM (no model providers are used and no data is sent outside of EdgeSet for inference).
No AI model training is performed in EdgeSet and no data is transmitted from EdgeSet to Tetmon or any third-party server/service for model training.
EdgeSet’s LLMs are not vulnerable to prompt injection because they have constrained, pre-defined output domains (they cannot produce free-form responses or use tools).

Application

All application queries (queries that EdgeSet uses internally) are protected against SQL injection attacks by compile-time query preparation and type-checking.
All API endpoints are rate limited (with stricter rate limits for authentication endpoints).
Static source code security checks are performed for each EdgeSet update.
End-to-end security tests are performed for each EdgeSet update, including (but not limited to):
- penetration tests
- port scanning
- header forging
- API fuzzing (including OWASP rules)
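
The SQL injection protection mentioned above relies on keeping query text and user-supplied values separate. EdgeSet does this with compile-time query preparation and type-checking. The sketch below shows only the related runtime technique of parameter binding, using Python's `sqlite3` module as a stand-in, with a hypothetical `users` table.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str) -> list:
    """Look up users by name using a bound parameter."""
    # The ? placeholder binds `name` as data; it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
```

A classic injection string such as `alice' OR '1'='1` is treated as a literal name here, so the lookup matches no rows instead of returning every user.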

Supply chain

A software dependency list, aka Software Bill of Materials, is viewable in the EdgeSet web interface.
All software dependencies are pinned by cryptographic hash.
Build scripts are not granted internet access (all software artifacts must be specified and pinned).
Tetmon’s Git and CI/CD servers are self-hosted. No cloud services (e.g. GitHub, GitLab) are used for any part of the EdgeSet build or update process.
All EdgeSet source code must pass a Git pull request approval process.
EdgeSet software engineers do not have master branch push access.
EdgeSet software engineers do not have SSH access to any build servers (software engineers can only ship code through the Git review and approval process).
All automated tests must pass before an update is made available (new releases are automatically gated by all automated tests).
EdgeSet software updates are cryptographically signed and verified (EdgeSet will not accept an update that has not been cryptographically signed by Tetmon).
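
Pinning dependencies by cryptographic hash, as described above, means a build accepts an artifact only if its digest matches a value recorded in advance. A minimal sketch of that check follows, assuming SHA-256 digests and a simple name-to-digest mapping. The function names and the pin format are hypothetical, and signature verification of updates is a separate step not shown here.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(name: str, data: bytes, pins: dict) -> bool:
    """Accept an artifact only if it is pinned and its digest matches."""
    expected = pins.get(name)
    if expected is None:
        return False  # unpinned artifacts are rejected outright
    return hmac.compare_digest(sha256_hex(data).encode("ascii"),
                               expected.encode("utf-8"))
```

Because the pin is fixed when the dependency is reviewed, a later change to the artifact at its source produces a different digest and fails the check.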