# Compression Guide

Omnistorage provides compression layers that wrap `io.Writer` and `io.Reader`.
## Available Compressors

| Format | Package | Use Case |
|---|---|---|
| Gzip | `compress/gzip` | Universal compatibility |
| Zstandard | `compress/zstd` | Better compression ratio and speed |
## Gzip Compression

### Writing Compressed Data

```go
import (
	"github.com/grokify/omnistorage/backend/file"
	"github.com/grokify/omnistorage/compress/gzip"
)

backend := file.New(file.Config{Root: "/data"})

// Create the writer stack
fileWriter, _ := backend.NewWriter(ctx, "data.txt.gz")
gzipWriter, _ := gzip.NewWriter(fileWriter)

// Write data
gzipWriter.Write([]byte("compressed content"))
gzipWriter.Close() // Important: closes both gzip and file writers
```
### Reading Compressed Data

```go
fileReader, _ := backend.NewReader(ctx, "data.txt.gz")
gzipReader, _ := gzip.NewReader(fileReader)
defer gzipReader.Close()

data, _ := io.ReadAll(gzipReader)
```
### Compression Level

```go
// Default compression
gzipWriter, _ := gzip.NewWriter(fileWriter)

// Best compression (slower)
gzipWriter, _ := gzip.NewWriterLevel(fileWriter, gzip.BestCompression)

// Best speed (larger files)
gzipWriter, _ := gzip.NewWriterLevel(fileWriter, gzip.BestSpeed)

// No compression (for testing)
gzipWriter, _ := gzip.NewWriterLevel(fileWriter, gzip.NoCompression)
```
## Zstandard Compression

Zstandard (zstd) provides a better compression ratio and faster decompression than gzip.
### Writing with Zstd

```go
import "github.com/grokify/omnistorage/compress/zstd"

fileWriter, _ := backend.NewWriter(ctx, "data.txt.zst")
zstdWriter, _ := zstd.NewWriter(fileWriter)

zstdWriter.Write([]byte("compressed content"))
zstdWriter.Close()
```
### Reading with Zstd

```go
fileReader, _ := backend.NewReader(ctx, "data.txt.zst")
zstdReader, _ := zstd.NewReader(fileReader)
defer zstdReader.Close()

data, _ := io.ReadAll(zstdReader)
```
### Compression Level

```go
// Default level
zstdWriter, _ := zstd.NewWriter(fileWriter)

// Custom level (1-22, default is 3)
zstdWriter, _ := zstd.NewWriterLevel(fileWriter, 10)
```
## Combining with Format Layers

Stack compression with format layers:

```go
import (
	"encoding/json"

	"github.com/grokify/omnistorage/backend/s3"
	"github.com/grokify/omnistorage/compress/gzip"
	"github.com/grokify/omnistorage/format/ndjson"
)

s3Backend, _ := s3.New(s3Config)

// Create the writer stack: S3 -> Gzip -> NDJSON
raw, _ := s3Backend.NewWriter(ctx, "logs/2024-01-08.ndjson.gz")
compressed, _ := gzip.NewWriter(raw)
writer := ndjson.NewWriter(compressed)

// Write records
for _, record := range records {
	data, _ := json.Marshal(record)
	writer.Write(data)
}

writer.Close() // Closes entire stack
```
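The loop above drops marshal and write errors for brevity. A hedged sketch of the same loop that propagates the first failure instead, using only the calls shown above and assuming `Write` and `Close` return errors in the usual `io.Writer`/`io.Closer` style (the guide's snippets do not spell out their signatures):

```go
// Error-checked variant of the record loop above (illustrative sketch).
for _, record := range records {
	data, err := json.Marshal(record)
	if err != nil {
		return fmt.Errorf("marshal record: %w", err)
	}
	if _, err := writer.Write(data); err != nil {
		return fmt.Errorf("write record: %w", err)
	}
}
// Closing the outermost writer flushes and closes the whole stack.
if err := writer.Close(); err != nil {
	return fmt.Errorf("close writer stack: %w", err)
}
```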
### Reading the Stack

```go
raw, _ := s3Backend.NewReader(ctx, "logs/2024-01-08.ndjson.gz")
decompressed, _ := gzip.NewReader(raw)
reader := ndjson.NewReader(decompressed)

for {
	record, err := reader.Read()
	if err == io.EOF {
		break
	}
	if err != nil {
		return err
	}
	process(record)
}

reader.Close()
```
## Choosing a Compressor
| Factor | Gzip | Zstd |
|---|---|---|
| Compatibility | Universal | Growing |
| Compression ratio | Good | Better |
| Compression speed | Moderate | Fast |
| Decompression speed | Moderate | Very fast |
| Memory usage | Low | Low-Medium |
### When to Use Gzip
- Compatibility is important (web servers, browsers)
- Files will be served over HTTP
- Working with legacy systems
### When to Use Zstd
- Better compression is important
- Fast decompression is needed
- Processing large data volumes
- Internal/controlled environments
## File Extensions

Follow conventions for file extensions:

| Format | Extension |
|---|---|
| Gzip | `.gz` |
| Zstd | `.zst` or `.zstd` |

Combine with format extensions:

- `data.json.gz` - Gzip-compressed JSON
- `logs.ndjson.zst` - Zstd-compressed NDJSON
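When names follow these conventions, a caller can pick the decompressor from the extension. A minimal sketch of such a helper (`newDecompressor` is hypothetical, not part of Omnistorage; it assumes the `gzip.NewReader` and `zstd.NewReader` constructors shown earlier accept any `io.Reader` and return readers, and that callers still close the concrete reader they obtain):

```go
import (
	"io"
	"strings"

	"github.com/grokify/omnistorage/compress/gzip"
	"github.com/grokify/omnistorage/compress/zstd"
)

// newDecompressor wraps r with a decompressor chosen from the object name's
// extension. Plain names are returned unwrapped. Hypothetical helper for
// illustration only.
func newDecompressor(name string, r io.Reader) (io.Reader, error) {
	switch {
	case strings.HasSuffix(name, ".gz"):
		return gzip.NewReader(r)
	case strings.HasSuffix(name, ".zst"), strings.HasSuffix(name, ".zstd"):
		return zstd.NewReader(r)
	default:
		return r, nil
	}
}
```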
## Error Handling

```go
gzipReader, err := gzip.NewReader(fileReader)
if err != nil {
	// Invalid gzip header or corrupted data
	return fmt.Errorf("failed to create gzip reader: %w", err)
}
```
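Errors on the write path matter just as much: compressors typically flush their final bytes at `Close`, so an unchecked `Close` can leave a truncated object. A hedged sketch of the write-side checks, reusing `fileWriter` from the earlier examples and assuming `Write` and `Close` return errors in the usual `io` style:

```go
gzipWriter, err := gzip.NewWriter(fileWriter)
if err != nil {
	return fmt.Errorf("failed to create gzip writer: %w", err)
}
if _, err := gzipWriter.Write([]byte("compressed content")); err != nil {
	return fmt.Errorf("failed to write compressed data: %w", err)
}
// Close flushes the remaining compressed bytes; don't ignore its error.
if err := gzipWriter.Close(); err != nil {
	return fmt.Errorf("failed to close gzip writer: %w", err)
}
```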
## Best Practices
- Close writers in reverse order - Or just close the outermost writer
- Use appropriate compression level - Balance speed vs size
- Follow naming conventions - Use `.gz` or `.zst` extensions
- Stream large files - Don't load entire files into memory (see the sketch below)
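For the last point, a large file can be streamed through the compressor with `io.Copy` instead of being read fully into memory. A minimal sketch reusing the `gzipWriter` stack from the gzip section, assuming the gzip writer satisfies `io.Writer` as its `Write` method suggests (the source path is hypothetical):

```go
// Stream a large local file into the compressed backend writer.
// io.Copy moves data in small chunks, so memory use stays flat.
src, err := os.Open("/var/log/app.log") // hypothetical source file
if err != nil {
	return err
}
defer src.Close()

if _, err := io.Copy(gzipWriter, src); err != nil {
	return fmt.Errorf("failed to stream file: %w", err)
}
if err := gzipWriter.Close(); err != nil {
	return fmt.Errorf("failed to flush compressed data: %w", err)
}
```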