go-git - A highly extensible Git implementation in pure Go.

	Commit message (Collapse)	Author	Age	Files	Lines
*	plumbing: Optimise memory consumption for filesystem storage	Paulo Gomes	2023-10-28	1	-0/+13
\|	Previously, as part of building the index representation, the resolveObject func would create an interim plumbing.MemoryObject, which would then be saved into storage via storage.SetEncodedObject. This meant that objects were unnecessarily loaded into memory, only to then be saved to disk.
\|
\|	The change streamlines this process by:
\|	- Introducing the LazyObjectWriter interface, which enables the write operation to take place directly against the filesystem-based storage.
\|	- Leveraging multi-writers to process the input data once, while targeting multiple writers (e.g. hasher and storage).
\|
\|	An additional change relates to the caching of object info children within Parser.get. The cache is now skipped when a seekable filesystem is being used.
\|
\|	The impact of the changes can be observed when using seekable filesystem storages, especially when cloning large repositories. The stats below were captured by adapting the BenchmarkPlainClone test to clone https://github.com/torvalds/linux.git:
\|
\|	pkg: github.com/go-git/go-git/v5
\|	cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
\|	               │   /tmp/old    │              /tmp/new               │
\|	               │    sec/op     │    sec/op      vs base              │
\|	PlainClone-16     41.68 ± 17%     48.04 ± 9%    +15.27% (p=0.015 n=6)
\|
\|	               │   /tmp/old    │              /tmp/new               │
\|	               │     B/op      │     B/op       vs base              │
\|	PlainClone-16   1127.8Mi ± 7%   256.7Mi ± 50%   -77.23% (p=0.002 n=6)
\|
\|	               │   /tmp/old    │              /tmp/new               │
\|	               │   allocs/op   │   allocs/op    vs base              │
\|	PlainClone-16    3.125M ± 0%     3.800M ± 0%    +21.60% (p=0.002 n=6)
\|
\|	Notice that on average the memory consumption per operation is over 75% smaller. The time per operation increased by 15%, which may actually be less in long-running applications, due to the decreased GC pressure and garbage collection costs.
\|
\|	Signed-off-by: Paulo Gomes <pjbgf@linux.com>
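The multi-writer idea in the commit message above can be illustrated with a minimal sketch. It is not the code from this commit: `hashAndStore` and `demo` are hypothetical names, and an in-memory buffer stands in for the filesystem-backed storage. The input is read once and fanned out to a SHA-1 hasher and to the destination writer via `io.MultiWriter`. The hash covers the usual Git object header (`<type> <size>` followed by a NUL byte) plus the content.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// hashAndStore streams src exactly once, feeding both a SHA-1 hasher and dst.
// It is a hypothetical helper, not part of go-git.
func hashAndStore(dst io.Writer, src io.Reader, objType string, size int64) (string, error) {
	h := sha1.New()
	// Git object IDs are computed over "<type> <size>\x00" followed by the content.
	fmt.Fprintf(h, "%s %d\x00", objType, size)

	// Every chunk read from src is written to the hasher and to dst.
	n, err := io.Copy(io.MultiWriter(h, dst), src)
	if err != nil {
		return "", err
	}
	if n != size {
		return "", fmt.Errorf("expected %d bytes, copied %d", size, n)
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// demo hashes and "stores" a small blob; the buffer is an assumed stand-in
// for a writer that targets the filesystem.
func demo() (hash, stored string, err error) {
	var storage bytes.Buffer
	hash, err = hashAndStore(&storage, strings.NewReader("hello\n"), "blob", 6)
	return hash, storage.String(), err
}

func main() {
	hash, stored, err := demo()
	fmt.Println(hash, len(stored), err)
}
```

Because the content never has to be held in full, peak memory depends on the copy buffer rather than on the object size, which matches the direction of the B/op figures reported above.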
*	storage: filesystem, Populate index before use. Fixes #148	Arieh Schneier	2023-05-04	1	-0/+10
\| \| \| \|	Signed-off-by: Arieh Schneier <15041913+AriehSchneier@users.noreply.github.com>
*	Use sync.Pool pointers to optimise memory usage	Paulo Gomes	2022-11-07	1	-1/+13
\| \| \| \|	Signed-off-by: Paulo Gomes <pjbgf@linux.com>
*	plumbing: format/packfile, prevent large objects from being read into memory ↵	zeripath	2021-06-30	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	completely (#330) This PR adds code to prevent large objects from being read into memory from packfiles or the filesystem. Objects greater than 1Mb are now no longer directly stored in the cache or read completely into memory. This PR differs from and improves on the previous, broken #323 by fixing several bugs in the reader and transparently wrapping ReaderAt as a Reader. Signed-off-by: Andrew Thornton <art27@cantab.net>
*	Revert "plumbing: format/packfile, prevent large objects from being read ↵ v5.4.2	zeripath	2021-06-02	1	-8/+1
\| \| \| \| \|	into memory completely (#303)" (#329) This reverts commit 720c192831a890d0a36b4c6720b60411fa4a0159.
*	plumbing: format/packfile, prevent large objects from being read into memory ↵ v5.4.0	zeripath	2021-05-12	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \|	completely (#303) This PR adds code to prevent large objects from being read into memory from packfiles or the filesystem. Objects greater than 1Mb are now no longer directly stored in the cache or read completely into memory. Signed-off-by: Andrew Thornton <art27@cantab.net>
*	Support partial hashes in Repository.ResolveRevision.	David Symonds	2020-07-16	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \|	Like `git rev-parse <prefix>`, this enumerates the hashes of objects with the given prefix and adds them to the list of candidates for resolution. This has an exhaustive slow path, which requires enumerating all objects and filtering each one, but also a couple of fast paths for common cases. There's room for future work to make this faster; TODOs have been left for that. Fixes #135.
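The exhaustive slow path described above (enumerate all object hashes, keep those with the given prefix) can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `candidatesForPrefix` is a hypothetical name, hashes are plain hex strings, and the hash values in `main` are made up for the example. The fast paths mentioned in the message are not shown.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// candidatesForPrefix returns every hash in all that starts with prefix,
// sorted for stable output. It inspects each hash, which is what makes
// this the slow path.
func candidatesForPrefix(all []string, prefix string) []string {
	var out []string
	for _, h := range all {
		if strings.HasPrefix(h, prefix) {
			out = append(out, h)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Illustrative hashes only; they do not refer to real objects.
	hashes := []string{
		"ab12cd0000000000000000000000000000000001",
		"ab12ef0000000000000000000000000000000002",
		"ff00aa0000000000000000000000000000000003",
	}
	fmt.Println(len(candidatesForPrefix(hashes, "ab12"))) // two candidates share this prefix
	fmt.Println(candidatesForPrefix(hashes, "ff"))
}
```

A caller would then treat a single candidate as an unambiguous match and decide separately how to handle zero or several candidates.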
*	Close Reader & Writer of EncodedObject after use	Kyungmin Bae	2020-05-24	1	-0/+2
\|
*	*: migration from gopkg to go modules	Máximo Cuadros	2020-03-10	1	-10/+10
\|
*	filesystem: ObjectStorage, MaxOpenDescriptors option	Arran Walker	2019-04-22	1	-39/+106
\| \| \| \| \| \| \| \|	The MaxOpenDescriptors option provides a middle ground solution between keeping all packfiles open (as offered by the KeepDescriptors option) and keeping none open. Signed-off-by: Arran Walker <arran.walker@fiveturns.org>
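To illustrate the "middle ground" between keeping every packfile open and keeping none open, here is a minimal sketch of a bounded descriptor cache. Everything in it is an assumption made for illustration: the `handle` and `descriptorCache` types are hypothetical, a boolean flag stands in for closing a real file, and the oldest-first eviction is one possible policy, not necessarily the one go-git uses.

```go
package main

import "fmt"

// handle stands in for an open packfile descriptor (hypothetical type).
type handle struct {
	name   string
	closed bool
}

// descriptorCache keeps at most max handles open; max <= 0 means no limit.
type descriptorCache struct {
	max   int
	order []string // names in insertion order, oldest first
	open  map[string]*handle
}

func newDescriptorCache(max int) *descriptorCache {
	return &descriptorCache{max: max, open: make(map[string]*handle)}
}

// get returns an open handle for name, reusing a cached one when possible.
// When the limit is reached, the oldest handle is closed and dropped first.
func (c *descriptorCache) get(name string) *handle {
	if h, ok := c.open[name]; ok {
		return h
	}
	if c.max > 0 && len(c.open) >= c.max {
		oldest := c.order[0]
		c.order = c.order[1:]
		c.open[oldest].closed = true // stands in for closing the file
		delete(c.open, oldest)
	}
	h := &handle{name: name}
	c.open[name] = h
	c.order = append(c.order, name)
	return h
}

func main() {
	c := newDescriptorCache(2)
	a := c.get("pack-a")
	c.get("pack-b")
	c.get("pack-c") // limit reached: pack-a is closed and evicted
	fmt.Println(a.closed, len(c.open))
}
```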
*	storage/filesystem: check file object before using cache	Javi Fontan	2019-01-30	1	-5/+4
\| \| \| \| \| \| \| \| \| \|	If the cache is shared between several repositories, getFromUnpacked can erroneously return an object from another repository. This decreases performance a little, as there is an extra fs operation when the object is in the cache, but it is correct when the cache is shared. Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	plumbing: format/packfile, performance optimizations for reading large ↵	Filip Navara	2018-11-28	1	-23/+36
\| \| \| \| \| \|	commit histories (#963) Signed-off-by: Filip Navara <navara@emclient.com>
*	storage/filesystem: Added reindex method to reindex packfiles	Javier Peletier	2018-11-12	1	-0/+5
\| \| \| \|	Signed-off-by: Javier Peletier <jm@epiclabs.io>
*	filesystem: add a new test for EncodedObjectSize	Jeremy Stribling	2018-10-12	1	-3/+1
\| \| \| \| \| \|	Suggested by taruti. Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>
*	object: get object size without reading whole object	Jeremy Stribling	2018-10-11	1	-0/+75
\| \| \| \|	Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>
*	storage/filesystem: add more doc to NewPackfileIter	Javi Fontan	2018-09-21	1	-4/+7
\| \| \| \|	Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	storage/filesystem: keep packs open in PackfileIter	Javi Fontan	2018-09-20	1	-10/+23
\| \| \| \| \| \| \| \|	PackfileIter was not taking the KeepDescriptors option into account and was always closing the file. This caused "file already closed" errors when iterating packfiles with KeepDescriptors active. Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	Expose Storage cache.	kuba--	2018-09-07	1	-19/+10
\| \| \| \|	Signed-off-by: kuba-- <kuba@sourced.tech>
*	storage/dotgit: add KeepDescriptors option	Javi Fontan	2018-09-04	1	-1/+9
\| \| \| \| \| \| \| \| \| \|	This option keeps packfile file descriptors open after reading objects from them. It improves performance as packfiles do not have to be reopened each time an object is needed. Also adds Close to EncodedObjectStorer to close all the files manually. Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	storage/filesystem: move Options to filesystem and dotgit	Javi Fontan	2018-09-03	1	-0/+12
\| \| \| \|	Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	plumbing, storage: add bases to the common cache	Javi Fontan	2018-08-22	1	-2/+16
\| \| \| \| \| \| \| \| \| \| \| \|	After a clone, only resolved deltas were added to the cache. This caused slowdowns in small repositories where most objects can be held in cache. This change also makes packfiles reuse the delta cache from the store. Previously, a new delta cache was created each time a packfile object was created. This also slowed down object access a little and had an impact on memory consumption when bases were added to the cache. Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	plumbing: packfile, open and close packfile on FSObject reads	Miguel Molina	2018-08-09	1	-9/+6
\| \| \| \|	Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
*	storage: filesystem, close Packfile after iterating objects	Miguel Molina	2018-08-09	1	-1/+10
\| \| \| \|	Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
*	storage: filesystem, benchmark PackfileIter	Miguel Molina	2018-08-09	1	-4/+26
\| \| \| \|	Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
*	*: use parser to populate non writable storages and bug fixes	Miguel Molina	2018-08-07	1	-47/+30
\| \| \| \|	Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
*	plumbing, storage: integrate new index	Javi Fontan	2018-07-26	1	-17/+29
\| \| \| \| \| \|	Now dotgit.PackWriter uses the new packfile.Parser and index. Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	plumbing/format/idxfile: add new Index and MemoryIndex	Miguel Molina	2018-07-19	1	-1/+1
\| \| \| \|	Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
*	storage: filesystem, make ObjectStorage constructor public	Miguel Molina	2018-06-08	1	-2/+3
\| \| \| \|	Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
*	dotgit: Move package outside internal.	Antonio Jesus Navarro Perez	2018-06-05	1	-1/+1
\| \| \| \|	Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>
*	*: Use CheckClose with named returns	Javi Fontan	2018-03-27	1	-4/+4
\| \| \| \| \| \| \| \|	Previously some close errors were lost. This is especially problematic in go-git as lots of work is done here, like generating indexes and moving packfiles. Signed-off-by: Javi Fontan <jfontan@gmail.com>
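The pattern named in this commit can be sketched in a few lines. The `checkClose` helper below is a local stand-in written for this example (the commit refers to a helper called CheckClose; its exact code is not reproduced here), and `failingCloser` and `process` are hypothetical. The point is that a deferred close can only report its error to the caller if the function uses a named return value that the deferred call is able to overwrite.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// checkClose closes c and records the close error in *err,
// unless an earlier error is already stored there.
func checkClose(c io.Closer, err *error) {
	if cerr := c.Close(); cerr != nil && *err == nil {
		*err = cerr
	}
}

// failingCloser always fails on Close, to make the lost-error case visible.
type failingCloser struct{}

func (failingCloser) Close() error { return errors.New("close failed") }

// process uses a named return so the deferred checkClose can set it.
// With an unnamed return, the close error would be silently dropped.
func process(c io.Closer) (err error) {
	defer checkClose(c, &err)
	// ... work with c would happen here ...
	return nil
}

func main() {
	fmt.Println(process(failingCloser{}))
}
```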
*	storage/filesystem: optimize packfile iterator	Denys Smirnov	2018-03-03	1	-22/+61
\|	* do not store extra bool values in the seen map
\|	* open packfile iterators lazily
\|
\|	Signed-off-by: Denys Smirnov <denys@sourced.tech>
*	Make DeltaBaseCache private	Javi Fontan	2017-12-20	1	-6/+6
\| \| \| \|	Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	Enforce the use of cache in packfile decoder	Javi Fontan	2017-12-20	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Decoder object can make use of an object cache to speed up processing. Previously, the only way to specify it was to manually change the struct generated by NewDecodeForFile. This led to some instances being created without it, which penalized performance. Now the cache should be explicitly passed to the constructor function. NewDecoder now creates objects with a cache using the default size. A new helper function was added to create cache objects with the default size, as this now becomes a common task: cache.NewObjectLRUDefault() Signed-off-by: Javi Fontan <jfontan@gmail.com>
*	storage: filesystem, add support for git alternates (#663)	Sunny	2017-12-06	1	-0/+21
\| \| \| \|	This change adds a new method Alternates() in DotGit to check and query alternate source.
*	storage: some minor code cleanup	Jeremy Stribling	2017-11-29	1	-6/+3
\| \| \| \| \| \|	Suggested by mcuadros. Issue: #669
*	plumbing: add `HasEncodedObject` method to Storer	Jeremy Stribling	2017-11-29	1	-0/+26
\| \| \| \| \| \| \|	This allows the user to check whether an object exists, without reading all the object data from storage. Issue: KBFS-2445
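For loose objects, an existence check that avoids reading object data can be as simple as a stat call on the object's path. The sketch below assumes the standard loose-object layout (`objects/<first two hex digits>/<remaining 38>`) and covers only that case; objects stored in packfiles would need an index lookup instead. The function names are hypothetical and this is not the implementation added by the commit.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// looseObjectPath maps a 40-character hex hash to its loose-object path.
func looseObjectPath(gitDir, hash string) string {
	return filepath.Join(gitDir, "objects", hash[:2], hash[2:])
}

// hasLooseObject reports whether the object file exists, without opening it.
func hasLooseObject(gitDir, hash string) (bool, error) {
	if len(hash) != 40 {
		return false, fmt.Errorf("invalid hash %q", hash)
	}
	_, err := os.Stat(looseObjectPath(gitDir, hash))
	if err == nil {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err
}

// demo creates a throwaway directory containing one empty object file
// and probes for that hash and for one that was never written.
func demo() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "objects-demo")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)

	const hash = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
	p := looseObjectPath(dir, hash)
	if err = os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
		return
	}
	if err = os.WriteFile(p, nil, 0o644); err != nil {
		return
	}
	if first, err = hasLooseObject(dir, hash); err != nil {
		return
	}
	second, err = hasLooseObject(dir, "ce013625030ba8dba906f756967f9e9ca394464a")
	return
}

func main() {
	fmt.Println(demo())
}
```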
*	Make object repacking more configurable	Taru Karttunen	2017-11-29	1	-2/+2
\|
*	Support for repacking objects	Taru Karttunen	2017-11-29	1	-0/+8
\|
*	First pass of prune design	Taru Karttunen	2017-11-29	1	-0/+24
\|
*	all: simplification	ferhat elmas	2017-11-29	1	-2/+2
\|	- no length for map initialization
\|	- don't check for boolean/error return
\|	- don't format string
\|	- use string method of bytes buffer instead of converting bytes to string
\|	- use `strings.Contains` instead of `strings.Index`
\|	- use `bytes.Equal` instead of `bytes.Compare`
*	update to go-billy.v4 and go-git-fixtures.v3	Máximo Cuadros	2017-11-23	1	-1/+1
\| \| \| \|	Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>
*	Merge pull request #515 from smola/reuse-packed-objects	Máximo Cuadros	2017-07-27	1	-7/+97
\|\ \| \| \| \|	storage: reuse deltas from packfiles
\| *	storage: reuse deltas from packfiles	Santiago M. Mola	2017-07-27	1	-7/+97
\| \|	* plumbing: add DeltaObject interface for EncodedObjects that are deltas and hold additional information about them, such as the hash of the base object.
\| \|	* plumbing/storer: add DeltaObjectStorer interface for object storers that can return DeltaObject. Note that calls to EncodedObject will never return instances of DeltaObject. That requires explicit calls to DeltaObject.
\| \|	* storage/filesystem: implement DeltaObjectStorer interface.
\| \|	* plumbing/packfile: packfile encoder now supports reusing deltas that are already computed (e.g. from an existing packfile) if the storage implements DeltaObjectStorer.
\| \|
\| \|	Reusing deltas boosts performance of packfile generation (e.g. on push).
* \|	filesystem: reuse cache for packfile iterator	Santiago M. Mola	2017-07-27	1	-3/+4
\|/
*	plumbing/cache: change FIFO to LRU cache	Santiago M. Mola	2017-07-27	1	-1/+1
\|
*	storage/filesystem: reuse delta cache	Santiago M. Mola	2017-07-27	1	-1/+9
\| \| \| \| \|	Reuse delta base object cache for packfile decoders across multiple instances.
*	packfile: create packfile.Index and reuse it	Santiago M. Mola	2017-07-26	1	-33/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was an internal type (i.e. storage/filesystem.idx) to use as in-memory index for packfiles. This was not convenient to reuse in the packfile. This commit creates a new representation (format/packfile.Index) that can be converted to and from idxfile.Idxfile. A packfile.Index now contains the functionality that was scattered on storage/filesystem.idx and packfile.Decoder's internals. storage/filesystem now reuses packfile.Index instances and this also results in higher cache hit ratios when resolving deltas.
*	storage/filesystem: check all Close errors	Santiago M. Mola	2017-07-19	1	-9/+12
\|
*	*: upgrade to go-billy.v3, merge	Máximo Cuadros	2017-06-18	1	-1/+1
\|
*	Lazily load object index.	JP Sugarbroad	2017-04-06	1	-6/+22
\| \| \| \|	fixes #327