author | Alberto Cortés <alcortesm@gmail.com> | 2016-07-04 17:09:22 +0200 |
---|---|---|
committer | Máximo Cuadros <mcuadros@gmail.com> | 2016-07-04 17:09:22 +0200 |
commit | 5e73f01cb2e027a8f02801635b79d3a9bc866914 (patch) | |
tree | c0e7eb355c9b8633d99bab9295cb72b6c3a9c0e1 /formats/packfile/decoder.go | |
parent | 808076af869550a200a3a544c9ee2fa22a8b6a85 (diff) | |
download | go-git-5e73f01cb2e027a8f02801635b79d3a9bc866914.tar.gz |
Adds support to open local repositories and to use file-based object storage (#55) v3.1.0
* remove some comments
* idx writer/reader
* Shut up ssh tests, they are annoying
* Add file scheme test to clients
* Add dummy file client
* Add test for file client
* Make tests use fixture endpoint
* add parser for packed-refs format
* add parser for packed-refs format
* WIP adding dir.Refs() tests
* Add test for fixture refs
* refs parser for the refs directory
* Documentation
* Add Capabilities to file client
* tgz.Extract now accepts a path instead of a Reader
* fix bug in idxfile fanout calculation
* remove dead code
* packfile documentation
* clean packfile parser code
* add core.Object.Content() and returns errors for core.ObjectStorage.Iter()
* add seekable storage
* add dir repos to NewRepository
* clean prints
* Add dir client documentation to README
* Organize the README
* README
* Clean tgz package
* Clean temp dirs after tgz tests
* Gometalinter on gitdir
* Clean pattern function
* metalinter tgz
* metalinter gitdir
* gitdir coverage and remove seekable packfile filedescriptor leak
* gitdir Idxfile tests and remove file descriptor leak
* gitdir Idxfile tests when no idx is found
* clean storage/seekable/internal/index and some formats/idxfile API issues
* clean storage/seekable
* clean formats/idx
* turn packfile/doc.go into packfile/doc.txt
* move formats/packfile/reader to decoder
* fix packfile decoder error names
* improve documentation
* comment packfile decoder errors
* comment public API (format/packfile)
* remove duplicated code in packfile decoder test
* move tracking_reader into an internal package and clean it
* use iota for packfile format
* rename packfile parse.go to packfile object_at.go
* clean packfile deltas
* fix delta header size bug
* improve delta documentation
* clean packfile deltas
* clean packfiles deltas
* clean repository.go
* Remove go 1.5 from Travis CI
Because go 1.5 does not support internal packages.
* change local repo scheme to local://
* change "local://" to "file://" as the local scheme
* fix broken indentation
* shortens names of variables in short scopes
* more shortening of variable names
* more shortening of variable names
* Rename git dir client to "file", as the scheme used for it
* Fix file format ctor name, now that the package name has changed
* Shortcut local repo constructor to not use remotes
The object storage is built directly in the repository ctor, instead
of creating a remote and waiting for the user to pull it.
* update README and fix some errors in it
* remove file scheme client
* Local repositories now have a new ctor
That is, they are no longer identified by the scheme of the URL, but are
created differently from inception.
* remove unused URL field from Repository
* move all git dir logic to seekable storage ctor
* fix documentation
* Make formats/file/dir an internal package to storage/seekable
* change package storage/seekable to storage/fs
* clean storage/fs
* overall storage/fs clean
* more cleaning
* some metalinter fixes
* upgrade cshared to last changes
* remove dead code
* fix test error info
* remove file scheme check from clients
* fix test error message
* fix test error message
* fix error messages
* style changes
* fix comments everywhere
* style changes
* style changes
* scaffolding and tests for local packfiles without idx files
* outsource index building from packfile to the packfile decoder
* refactor packfile header reading into a new function
* move code to generate index from packfile back to index package
* add header parsing
* fix documentation errata
* add undeltified and OFS delta support for index building from the packfile
* add tests for packfile with ref-deltas
* support for packfiles with ref-deltas and no idx
* refactor packfile format parser to reuse code
* refactor packfile format parser to reuse code
* refactor packfile format parser to reuse code
* refactor packfile format parser to reuse code
* refactor packfile format parser to reuse code
* WIP refactor packfile format parser to reuse code
* refactor packfile format parser to reuse code
* remove prints from tests
* remove prints from tests
* refactor packfile.core into packfile.parser
* rename packfile reader to something that shows it is a recaller
* rename cannot recall error
* rename packfile.Reader to packfile.ReadRecaller and document
* speed up test by using StreamReader instead of SeekableReader when possible
* clean packfile StreamReader
* stream_reader tests
* refactor packfile.StreamReader into packfile.StreamReadRecaller
* refactor packfile.SeekableReader into packfile.SeekableReadRecaller and document it
* generalize packfile.StreamReadRecaller test to all packfile.ReadRecaller implementations
* speed up storage/fs tests
* speed up tests in . by loading packfiles in memory
* speed up repository tests by using a smaller fixture
* restore doc.go files
* rename packfile.ReadRecaller implementations to shorter names
* update comments to type changes
* packfile.Parser test (WIP)
* packfile.Parser tests and add ForgetAll() to packfile.ReadRecaller
* add test for packfile.ReadRecaller.ForgetAll()
* clarify seekable being able to recallByOffset forgotten objects
* use better names for internal maps
* metalinter packfile package
* speed up some tests
* documentation fixes
* change storage.fs package name to storage.proxy to avoid confusion with new filesystem support
* New fs package and os transparent implementation
Now NewRepositoryFromFS receives a fs and a path and tests are
modified accordingly, but it is still not used for anything.
* add fs to gitdir and proxy.store
* reduce fs interface for easier implementation
* remove garbage dirs from tgz tests
* change file name gitdir/dir.go to gitdir/gitdir.go
* fs.OS tests
* metalinter utils/fs
* add NewRepositoryFromFS documentation to README
* Readability fixes to README
* move tgz to an external dependency
* move filesystem impl. example to example dir
* rename proxy/store.go to proxy/storage.go for coherence with memory/storage.go
* rename proxy package to seekable
Diffstat (limited to 'formats/packfile/decoder.go')
-rw-r--r-- | formats/packfile/decoder.go | 116 |
1 files changed, 116 insertions, 0 deletions
diff --git a/formats/packfile/decoder.go b/formats/packfile/decoder.go
new file mode 100644
index 0000000..e8c5c6a
--- /dev/null
+++ b/formats/packfile/decoder.go
@@ -0,0 +1,116 @@
+package packfile
+
+import (
+	"io"
+
+	"gopkg.in/src-d/go-git.v3/core"
+)
+
+// Format specifies if the packfile uses ref-deltas or ofs-deltas.
+type Format int
+
+// Possible values of the Format type.
+const (
+	UnknownFormat Format = iota
+	OFSDeltaFormat
+	REFDeltaFormat
+)
+
+var (
+	// ErrMaxObjectsLimitReached is returned by Decode when the number
+	// of objects in the packfile is higher than
+	// Decoder.MaxObjectsLimit.
+	ErrMaxObjectsLimitReached = NewError("max. objects limit reached")
+
+	// ErrInvalidObject is returned by Decode when an invalid object is
+	// found in the packfile.
+	ErrInvalidObject = NewError("invalid git object")
+
+	// ErrPackEntryNotFound is returned by Decode when a reference in
+	// the packfile references and unknown object.
+	ErrPackEntryNotFound = NewError("can't find a pack entry")
+
+	// ErrZLib is returned by Decode when there was an error unzipping
+	// the packfile contents.
+	ErrZLib = NewError("zlib reading error")
+)
+
+const (
+	// DefaultMaxObjectsLimit is the maximum amount of objects the
+	// decoder will decode before returning ErrMaxObjectsLimitReached.
+	DefaultMaxObjectsLimit = 1 << 20
+)
+
+// Decoder reads and decodes packfiles from an input stream.
+type Decoder struct {
+	// MaxObjectsLimit is the limit of objects to be load in the packfile, if
+	// a packfile excess this number an error is throw, the default value
+	// is defined by DefaultMaxObjectsLimit, usually the default limit is more
+	// than enough to work with any repository, with higher values and huge
+	// repositories you can run out of memory.
+	MaxObjectsLimit uint32
+
+	p *Parser
+	s core.ObjectStorage
+}
+
+// NewDecoder returns a new Decoder that reads from r.
+func NewDecoder(r ReadRecaller) *Decoder {
+	return &Decoder{
+		MaxObjectsLimit: DefaultMaxObjectsLimit,
+
+		p: NewParser(r),
+	}
+}
+
+// Decode reads a packfile and stores it in the value pointed to by s.
+func (d *Decoder) Decode(s core.ObjectStorage) error {
+	d.s = s
+
+	count, err := d.p.ReadHeader()
+	if err != nil {
+		return err
+	}
+
+	if count > d.MaxObjectsLimit {
+		return ErrMaxObjectsLimitReached.AddDetails("%d", count)
+	}
+
+	err = d.readObjects(count)
+
+	return err
+}
+
+func (d *Decoder) readObjects(count uint32) error {
+	// This code has 50-80 µs of overhead per object not counting zlib inflation.
+	// Together with zlib inflation, it's 400-410 µs for small objects.
+	// That's 1 sec for ~2450 objects, ~4.20 MB, or ~250 ms per MB,
+	// of which 12-20 % is _not_ zlib inflation (ie. is our code).
+	for i := 0; i < int(count); i++ {
+		start, err := d.p.Offset()
+		if err != nil {
+			return err
+		}
+
+		obj, err := d.p.ReadObject()
+		if err != nil {
+			if err == io.EOF {
+				break
+			}
+
+			return err
+		}
+
+		err = d.p.Remember(start, obj)
+		if err != nil {
+			return err
+		}
+
+		_, err = d.s.Set(obj)
+		if err == io.EOF {
+			break
+		}
+	}
+
+	return nil
+}