aboutsummaryrefslogtreecommitdiffstats
path: root/odict.py
blob: 3bf708a7ec04c15bf9394ab9f6a4b1eb5989c698 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
# -*- coding: utf-8 -*-
"""
    odict
    ~~~~~

    This module is an example implementation of an ordered dict for the
    collections module.  It's not written for performance (it actually
    performs pretty bad) but to show how the API works.


    Questions and Answers
    =====================

    Why would anyone need ordered dicts?

        Dicts in python are unordered which means that the order of items when
        iterating over dicts is undefined.  As a matter of fact it is most of
        the time useless and differs from implementation to implementation.

        Many developers stumble upon that problem sooner or later when
        comparing the output of doctests which often does not match the order
        the developer thought it would.

        Also XML systems such as Genshi have their problems with unordered
        dicts as the input and output ordering of tag attributes is often
        mixed up because the ordering is lost when converting the data into
        a dict.  Switching to lists is often not possible because the
        complexity of a lookup is too high.

        Another very common case is metaprogramming.  The default namespace
        of a class in python is a dict.  With Python 3 it becomes possible
        to replace it with a different object which could be an ordered dict.
        Django is already doing something similar with a hack that assigns
        numbers to some descriptors initialized in the class body of a
        specific subclass to restore the ordering after class creation.

        When porting code from programming languages such as PHP and Ruby
        where the item-order in a dict is guaranteed it's also a great help
        to have an equivalent data structure in Python to ease the transition.

    Where are new keys added?

        At the end.  This behavior is consistent with Ruby 1.9 Hashmaps
        and PHP Arrays.  It also matches what common ordered dict
        implementations do currently.

    What happens if an existing key is reassigned?

        The key is *not* moved.  This is consitent with existing
        implementations and can be changed by a subclass very easily::

            class movingodict(odict):
                def __setitem__(self, key, value):
                    self.pop(key, None)
                    odict.__setitem__(self, key, value)

        Moving keys to the end of a ordered dict on reassignment is not
        very useful for most applications.

    Does it mean the dict keys are sorted by a sort expression?

        That's not the case.  The odict only guarantees that there is an order
        and that newly inserted keys are inserted at the end of the dict.  If
        you want to sort it you can do so, but newly added keys are again added
        at the end of the dict.

    I initializes the odict with a dict literal but the keys are not
    ordered like they should!

        Dict literals in Python generate dict objects and as such the order of
        their items is not guaranteed.  Before they are passed to the odict
        constructor they are already unordered.

    What happens if keys appear multiple times in the list passed to the
    constructor?

        The same as for the dict.  The latter item overrides the former.  This
        has the side-effect that the position of the first key is used because
        the key is actually overwritten:

        >>> odict([('a', 1), ('b', 2), ('a', 3)])
        odict.odict([('a', 3), ('b', 2)])

        This behavor is consistent with existing implementation in Python
        and the PHP array and the hashmap in Ruby 1.9.

    This odict doesn't scale!

        Yes it doesn't.  The delitem operation is O(n).  This is file is a
        mockup of a real odict that could be implemented for collections
        based on an linked list.

    Why is there no .insert()?

        There are few situations where you really want to insert a key at
        an specified index.  To now make the API too complex the proposed
        solution for this situation is creating a list of items, manipulating
        that and converting it back into an odict:

        >>> d = odict([('a', 42), ('b', 23), ('c', 19)])
        >>> l = d.items()
        >>> l.insert(1, ('x', 0))
        >>> odict(l)
        odict.odict([('a', 42), ('x', 0), ('b', 23), ('c', 19)])

    :copyright: (c) 2008 by Armin Ronacher and PEP 273 authors.
    :license: modified BSD license.
"""
from itertools import izip, imap
from copy import deepcopy

missing = object()


class odict(dict):
    """
    Ordered dict example implementation.

    This is the proposed interface for a an ordered dict as proposed on the
    Python mailinglist (proposal_).

    It's a dict subclass and provides some list functions.  The implementation
    of this class is inspired by the implementation of Babel but incorporates
    some ideas from the `ordereddict`_ and Django's ordered dict.

    The constructor and `update()` both accept iterables of tuples as well as
    mappings:

    >>> d = odict([('a', 'b'), ('c', 'd')])
    >>> d.update({'foo': 'bar'})
    >>> d
    odict.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')])
    
    Keep in mind that when updating from dict-literals the order is not
    preserved as these dicts are unsorted!

    You can copy an odict like a dict by using the constructor, `copy.copy`
    or the `copy` method and make deep copies with `copy.deepcopy`:

    >>> from copy import copy, deepcopy
    >>> copy(d)
    odict.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')])
    >>> d.copy()
    odict.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')])
    >>> odict(d)
    odict.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')])
    >>> d['spam'] = []
    >>> d2 = deepcopy(d)
    >>> d2['spam'].append('eggs')
    >>> d
    odict.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', [])])
    >>> d2
    odict.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', ['eggs'])])

    All iteration methods as well as `keys`, `values` and `items` return
    the values ordered by the the time the key-value pair is inserted:

    >>> d.keys()
    ['a', 'c', 'foo', 'spam']
    >>> d.values()
    ['b', 'd', 'bar', []]
    >>> d.items()
    [('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', [])]
    >>> list(d.iterkeys())
    ['a', 'c', 'foo', 'spam']
    >>> list(d.itervalues())
    ['b', 'd', 'bar', []]
    >>> list(d.iteritems())
    [('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', [])]

    Index based lookup is supported too by `byindex` which returns the
    key/value pair for an index:

    >>> d.byindex(2)
    ('foo', 'bar')

    You can reverse the odict as well:

    >>> d.reverse()
    >>> d
    odict.odict([('spam', []), ('foo', 'bar'), ('c', 'd'), ('a', 'b')])
    
    And sort it like a list:

    >>> d.sort(key=lambda x: x[0].lower())
    >>> d
    odict.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', [])])

    .. _proposal: http://thread.gmane.org/gmane.comp.python.devel/95316
    .. _ordereddict: http://www.xs4all.nl/~anthon/Python/ordereddict/
    """

    def __init__(self, *args, **kwargs):
        dict.__init__(self)
        self._keys = []
        self.update(*args, **kwargs)

    def __delitem__(self, key):
        dict.__delitem__(self, key)
        self._keys.remove(key)

    def __setitem__(self, key, item):
        if key not in self:
            self._keys.append(key)
        dict.__setitem__(self, key, item)

    def __deepcopy__(self, memo=None):
        if memo is None:
            memo = {}
        d = memo.get(id(self), missing)
        if d is not missing:
            return d
        memo[id(self)] = d = self.__class__()
        dict.__init__(d, deepcopy(self.items(), memo))
        d._keys = self._keys[:]
        return d

    def __getstate__(self):
        return {'items': dict(self), 'keys': self._keys}

    def __setstate__(self, d):
        self._keys = d['keys']
        dict.update(d['items'])

    def __reversed__(self):
        return reversed(self._keys)

    def __eq__(self, other):
        if isinstance(other, odict):
            if not dict.__eq__(self, other):
                return False
            return self.items() == other.items()
        return dict.__eq__(self, other)

    def __ne__(self, other):
        return not self.__eq__(other)

    def __cmp__(self, other):
        if isinstance(other, odict):
            return cmp(self.items(), other.items())
        elif isinstance(other, dict):
            return dict.__cmp__(self, other)
        return NotImplemented

    @classmethod
    def fromkeys(cls, iterable, default=None):
        return cls((key, default) for key in iterable)

    def clear(self):
        del self._keys[:]
        dict.clear(self)

    def copy(self):
        return self.__class__(self)

    def items(self):
        return zip(self._keys, self.values())

    def iteritems(self):
        return izip(self._keys, self.itervalues())

    def keys(self):
        return self._keys[:]

    def iterkeys(self):
        return iter(self._keys)

    def pop(self, key, default=missing):
        if default is missing:
            return dict.pop(self, key)
        elif key not in self:
            return default
        self._keys.remove(key)
        return dict.pop(self, key, default)

    def popitem(self, key):
        self._keys.remove(key)
        return dict.popitem(key)

    def setdefault(self, key, default=None):
        if key not in self:
            self._keys.append(key)
        dict.setdefault(self, key, default)

    def update(self, *args, **kwargs):
        sources = []
        if len(args) == 1:
            if hasattr(args[0], 'iteritems'):
                sources.append(args[0].iteritems())
            else:
                sources.append(iter(args[0]))
        elif args:
            raise TypeError('expected at most one positional argument')
        if kwargs:
            sources.append(kwargs.iteritems())
        for iterable in sources:
            for key, val in iterable:
                self[key] = val

    def values(self):
        return map(self.get, self._keys)

    def itervalues(self):
        return imap(self.get, self._keys)

    def index(self, item):
        return self._keys.index(item)

    def byindex(self, item):
        key = self._keys[item]
        return (key, dict.__getitem__(self, key))

    def reverse(self):
        self._keys.reverse()

    def sort(self, *args, **kwargs):
        self._keys.sort(*args, **kwargs)

    def __repr__(self):
        return 'odict.odict(%r)' % self.items()

    __copy__ = copy
    __iter__ = iterkeys


if __name__ == '__main__':
    import doctest
    doctest.testmod()