I want to store a collection of data in an ArrayList or a Hashtable, but data retrieval should be efficient and fast. I want to know which data structure is hidden behind ArrayList and Hashtable (e.g. linked list, doubly linked list).
An ArrayList is a dynamic array that grows as new items are added that go beyond the current capacity of the list. Items in ArrayList are accessed by index, much like an array.
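To make the "grows as new items are added" part concrete, here is a minimal sketch in Python (used only as a neutral illustration language). The class name DynamicArray and the doubling growth policy are assumptions for the example, not a description of the actual .NET implementation.

```python
class DynamicArray:
    """Toy growable array: a fixed-size backing list that is replaced when full."""

    def __init__(self, capacity=4):
        self._items = [None] * capacity  # backing store
        self._count = 0                  # number of slots actually in use

    @property
    def capacity(self):
        return len(self._items)

    def add(self, value):
        if self._count == len(self._items):
            # Full: allocate a store twice as large (assumed policy) and copy.
            bigger = [None] * (2 * len(self._items))
            bigger[:self._count] = self._items
            self._items = bigger
        self._items[self._count] = value
        self._count += 1

    def __getitem__(self, index):
        # Index access is a direct lookup into the backing store.
        if not 0 <= index < self._count:
            raise IndexError(index)
        return self._items[index]

    def __len__(self):
        return self._count


arr = DynamicArray(capacity=2)
for n in (10, 20, 30):
    arr.add(n)  # the third add triggers one reallocation
```

Reads by index never touch more than one slot, which is why index access stays cheap no matter how large the list gets.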
The Hashtable is a hashtable behind the scenes. The underlying data structure is typically an array but instead of accessing via an index, you access via a key field which maps to a location in the hashtable by calling the key object's GetHashCode() method.
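As a rough sketch of that key-to-location mapping (again in Python, purely for illustration): the hash code of the key is reduced to an index into an array of buckets. The class TinyHashTable and the use of chaining for collisions are assumptions for this example and say nothing about how .NET resolves collisions internally.

```python
class TinyHashTable:
    """Toy hash table: an array of buckets, each a list of (key, value) pairs."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, key):
        # hash(key) plays the role of GetHashCode(); the modulo maps it
        # to a position in the underlying array.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        for existing_key, value in self._bucket_for(key):
            if existing_key == key:
                return value
        raise KeyError(key)


table = TinyHashTable()
table.put("examplenum", 5)
table.put("examplenum2", 7)
table.put("examplenum", 6)  # same key: replaces the earlier value
```

A lookup only scans the one bucket the key hashes to, so the cost stays roughly constant as long as keys spread evenly over the buckets.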
In general, ArrayList and Hashtable are discouraged in .NET 2.0 and above in favor of List<T> and Dictionary<TKey, TValue>, their generic counterparts, which are type-safe, perform better, and don't incur boxing costs for value types.
I've got a blog post that compares the various benefits of each of the generic containers here that may be useful:
http://geekswithblogs.net/BlackRabbitCoder/archive/2011/06/16/c.net-fundamentals-choosing-the-right-collection-class.aspx
While it talks about the generic collections in particular, ArrayList would have similar complexity costs to List<T>, and Hashtable to Dictionary<TKey, TValue>.
A Hashtable maps keys (for example, strings) to values. An ArrayList keeps a bunch of items in numbered order.
using System.Collections;

Hashtable ht = new Hashtable();
ht["examplenum"] = 5;
ht["examplenum2"] = 7;
// Then to retrieve (values are stored as object, so a cast is needed)
int i = (int)ht["examplenum"]; // value of 5
ArrayList al = new ArrayList();
al.Add(2);
al.Add(3);
// Then to retrieve
int j = (int)al[0]; // value of 2
As its name implies, an ArrayList (or a List) is implemented with an array, and in fact a Hashtable is also built on the same data structure. So both offer constant-time access: an ArrayList by index, and a Hashtable by key on average (assuming a reasonable hash function).
What you have to think about is what kind of key you need. If your data must be accessed with an arbitrary key (for example, a string), you will not be able to use an ArrayList. A Hashtable should also be your preferred choice if the keys are not (more or less) consecutive integers.
Hope it helps.
MATLAB tables let you index into any column/field using the row name, e.g., MyTable.FourthColumn('SecondRowName'). Compared to this, dictionaries (containers.Map) seem primitive, e.g., it serves the role of a 1-column table. It also has its own dedicated syntax, which slows down the thinking about how to code.
I'm beginning to think that I can forget the use of dictionaries. Are there typical situations for which that would not be advisable?
TL;DR: No. containers.Map has uses that cannot be replaced with a table. And I would not choose a table for a dictionary.
containers.Map and table have many differences worth noting. They each have their use. A third container we can use to create a dictionary is a struct.
To use a table as a dictionary, you'd define only one column, and specify row names:
T = table(data,'VariableNames',{'value'},'RowNames',names);
Here are some notable differences between these containers when used as a dictionary:
Speed: The struct has the fastest access by far (10x). containers.Map is about twice as fast as a table when used in an equivalent way (i.e. a single-column table with row names).
Keys: A struct is limited to keys that are valid variable names, the other two can use any string as a key. The containers.Map keys can be scalar numbers as well (floating-point or integer).
Data: They all can contain heterogeneous data (each value has a different type), but a table changes how you index if you do this (T.value(name) for homogeneous data, T.value{name} for heterogeneous data).
Syntax: To lookup the key, containers.Map provides the most straight-forward syntax: M(name). A table turned into a dictionary requires the pointless use of the column name: T.value(name). A struct, if the key is given by the contents of a variable, looks a little awkward: S.(name).
Construction: (See the code below.) containers.Map has the most straight-forward method for building a dictionary from given data. The struct is not meant for this purpose, and therefore it gets complicated.
Memory: This is hard to compare, as containers.Map is implemented in Java and therefore whos reports only 8 bytes (i.e. a pointer). A table can be more memory efficient than a struct, if the data is homogeneous (all values have the same type) and scalar, as in this case all values for one column are stored in a single array.
Other differences:
A table obviously can contain multiple columns, and has lots of interesting methods to manipulate data.
A struct is actually a struct array, and can be indexed as S(i,j).(name). Of course name can be fixed, rather than a variable, leading to S(i,j).name. Of the three, this is the only built-in type, which is the reason it is so much more efficient.
Here is some code that shows the difference between these three containers for constructing a dictionary and looking up a value:
% Create names
names = cell(1,100);
for ii=1:numel(names)
names{ii} = char(randi(+'az',1,20));
end
name = names{1};
% Create data
values = rand(1,numel(names));
% Construct
M = containers.Map(names,values);
T = table(values.','VariableNames',{'value'},'RowNames',names);
S = num2cell(values);
S = [names;S];
S = struct(S{:});
% Lookup
M(name)
T.value(name)
S.(name)
% Timing lookup
timeit(@()M(name))
timeit(@()T.value(name))
timeit(@()S.(name))
Timing results (microseconds):
M: 16.672
T: 23.393
S: 2.609
You can go simpler: a struct can be accessed using a string as the field name:
clear
% define
mydata.('vec')=[2 4 1];
mydata.num=12.58;
% get
select1='num';
value1=mydata.(select1); %method 1
select2='vec';
value2=getfield(mydata,select2) %method 2
Not sure if this is science fiction, but would it be possible to create a type that represents an Array that matches a certain condition, such as being always sorted?
Or a 2-tuple where the first element is always bigger than the second?
What you're describing is called a dependent type (https://en.wikipedia.org/wiki/Dependent_type). Swift does not have these, and I'm not aware of any mainstream (non-research) language that does. You can of course create a special kind of collection that is indexed like an array and sorts itself whenever it is modified, and you can create a struct with greater and lesser properties that always reorders itself. But these criteria cannot be attached to the existing Array or tuple types.
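The two workarounds mentioned above might look roughly like this. It is a sketch in Python rather than Swift, and the names SortedList and OrderedPair are made up for the example. The invariants are enforced at runtime by the wrapper types, not checked by a type system.

```python
import bisect


class SortedList:
    """Array-like collection that keeps its items sorted on every insert."""

    def __init__(self, items=()):
        self._items = sorted(items)

    def add(self, value):
        # insort places the value at its sorted position.
        bisect.insort(self._items, value)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


class OrderedPair:
    """Pair whose 'greater' field is never smaller than its 'lesser' field."""

    def __init__(self, a, b):
        # Reorder on construction so the invariant always holds.
        self.greater, self.lesser = (a, b) if a >= b else (b, a)


numbers = SortedList([3, 1, 2])
numbers.add(0)
pair = OrderedPair(2, 9)
```

Because callers can only modify the data through add or the constructor, there is no way to observe an unsorted SortedList or a misordered OrderedPair, short of reaching into the private fields.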
I have a LinkedHashSet which was created from a Seq. I used a LinkedHashSet because I need to keep the order of the Seq, but also ensure uniqueness, like a Set. I need to check this LinkedHashSet against another sequence to verify that various properties within them are the same. I assumed that I could loop through using an index, i, but it appears not. Here is an example of what I would like to accomplish.
var s: Seq[Int] = { 1 to mySeq.size }
return s.forall { i =>
myLHS.indexOf(i).something == mySeq.indexOf(i).something &&
myLHS.indexOf(i).somethingelse == mySeq.indexOf(i).somethingelse
}
So how do I access individual elements of the LHS?
Consider using the zip method on collections to create a collection of pairs (Tuples). The specifics of this depend on your specifics. You may want to do mySeq.zip(myLHS) or myLHS.zip(mySeq), which will create different structures. You probably want mySeq.zip(myLHS), but I'm guessing. Also, if the collections are very large, you may want to take a view first, e.g. mySeq.view.zip(myLHS) so that the pair collection is also non-strict.
Once you have this combined collection, you can use a for-comprehension (or directly, myZip.foreach) to traverse it.
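The same pairing idea, sketched in Python for illustration. The Item type with its something and somethingelse fields mirrors the placeholders in the question, and an insertion-ordered dict stands in for the LinkedHashSet. Both are assumptions for the example, as is the explicit length check, which is there because zip silently stops at the shorter input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    something: int
    somethingelse: str


def same_properties(left, right):
    """Compare two ordered collections element by element via zip."""
    left, right = list(left), list(right)
    if len(left) != len(right):
        return False
    return all(
        a.something == b.something and a.somethingelse == b.somethingelse
        for a, b in zip(left, right)
    )


my_seq = [Item(1, "a"), Item(2, "b")]
# dict.fromkeys keeps insertion order and drops duplicates, much like a
# linked hash set; iterating over it yields the keys in order.
my_lhs = dict.fromkeys(my_seq)
matches = same_properties(my_seq, my_lhs)
```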
A LinkedHashSet is not necessary in this situation. Since I made it from a Seq, it is already ordered. I do not have to convert it to a LHS in order to also make it unique. Apparently, Seq has the distinct method which will remove duplicates from the sequence. From there, I can access the items via their indexes.
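For comparison, order-preserving de-duplication followed by index access can be sketched in Python as follows. The helper name distinct is borrowed from the Scala method for readability and is defined here, not a built-in.

```python
def distinct(seq):
    """Drop duplicates while keeping the first occurrence of each item in order."""
    return list(dict.fromkeys(seq))


unique = distinct([3, 1, 3, 2, 1])
second = unique[1]  # plain index access works on the de-duplicated sequence
```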
Is there a way to define a Set data structure in PowerShell?
In computer science, a set is an abstract data type that can store certain values, without any particular order, and no repeated values. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.
I need to use a data structure as keystore that:
assures no-repetitions;
minimizes the computational effort in order to retrieve and remove an element.
You can use the .NET HashSet class that is found under System.Collections.Generic:
$set = New-Object System.Collections.Generic.HashSet[int]
The collection guarantees unique items and the Add, Remove, and Contains methods all operate with O(1) complexity on average.
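The behaviour described here (no repetitions, cheap membership test and removal) is common to hash-based sets in general. As a language-neutral sketch, the same three operations on Python's built-in set look like this. The variable names are made up for the example.

```python
keystore = set()
for key in (7, 42, 7):
    keystore.add(key)      # adding 7 a second time has no effect

had_42 = 42 in keystore    # membership test, O(1) on average
keystore.discard(42)       # removal, O(1) on average
```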
If you prefer to stick with native PowerShell types, you can use HashTable and just ignore the key values:
# Initialize the set
$set = @{}
# Add an item
$set.Add("foo", $true)
# Or, if you prefer add/update semantics
$set["foo"] = $true
# Check if item exists
if ($set.Contains("foo"))
{
echo "exists"
}
# Remove item
$set.Remove("foo")
For more information see: https://powershellexplained.com/2016-11-06-powershell-hashtable-everything-you-wanted-to-know-about/#removing-and-clearing-keys
HashSet is what you are looking for if you want to store only unique values with relatively fast add, remove and find operations. It can be created like this:
$set = [System.Collections.Generic.HashSet[int]]@()
I'm just mulling over the best way, i.e. the best data structure, to store data that has several rows and columns. Should I store it as:
1. an array of arrays?
2. NSDictionary?
Or is there any grid-like data structure in iOS from which I can easily fetch any row/column? For example, I must be able to fetch the value in the 3rd column of row 5. Currently, say, I store each row as an array and then store these arrays in another array (so an array of arrays). Then to fetch the value in column 3 in row 5, I need to fetch the 5th row in the array of arrays, and then in the resulting array, I need to fetch the 3rd object. Is there a better way to do this? Thoughts please?
then to fetch the value in column 3 in row 5, I need to fetch the 5th
row in the array of arrays, and then in the resulting array, I need to
fetch the 3rd object. Is there a better way to do this?
An array of arrays is fine for the implementation, and the collection subscripting that was recently added to Objective-C makes this easier -- you can use an expression like
NSString *s = myData[m][n];
to get the string at the nth column of the mth row.
That said, it may still be a good idea to create a separate class for your data structure, so that the rest of your code is protected from needing to know about how the data is stored. That would also simplify the process of changing the implementation from, say, an array of arrays to a SQLite table or something else.
Your data storage class doesn't need to be fancy or complicated. Here's a first pass:
@interface DataTable : NSObject
- (id)objectAtRow:(NSInteger)row column:(NSInteger)column;
- (void)setObject:(id)object atRow:(NSInteger)row column:(NSInteger)column;
@end
I'm sure you can see how to implement those in terms of an array of arrays. You'll have to do a little work to add rows and/or columns when the caller tries to set a value outside the current bounds. You might also want to add support for things like fast enumeration and writing to and reading from property lists, but that can come later.
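As a rough sketch of that implementation, here is the same interface in Python for brevity rather than Objective-C. The method names object_at and set_object and the choice to pad with None are assumptions for the example. Rows and columns are added on demand when a value is set outside the current bounds.

```python
class DataTable:
    """Grid backed by an array of arrays; grows when set_object goes out of bounds."""

    def __init__(self):
        self._rows = []  # list of rows, each row a list of cells

    def object_at(self, row, column):
        # Raises IndexError if the cell lies outside what has been stored.
        return self._rows[row][column]

    def set_object(self, value, row, column):
        # Add empty rows until the requested row exists.
        while len(self._rows) <= row:
            self._rows.append([])
        cells = self._rows[row]
        # Pad the row with None until the requested column exists.
        if len(cells) <= column:
            cells.extend([None] * (column + 1 - len(cells)))
        cells[column] = value


table = DataTable()
table.set_object("x", 4, 2)  # creates rows 0..4 and pads row 4 to 3 cells
```

Callers only see object_at and set_object, so the backing store could later be swapped for something else without touching the rest of the code.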
There are other ways of doing it, but there's nothing wrong with the method you are using. You could use an NSDictionary with a key of type NSIndexPath, for example, or even a string key of the form "row,col", but I don't see any advantage in those except for sparse matrices.
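The dictionary-keyed-by-position idea can be sketched like this, with a Python dict and a (row, col) tuple standing in for NSDictionary and NSIndexPath. The variable names are made up for the example. Only the cells that actually hold a value take up space.

```python
sparse = {}
sparse[(4, 2)] = "x"      # row 4, column 2
sparse[(1000, 7)] = "y"   # no storage is needed for the rows in between

value = sparse.get((4, 2))
missing = sparse.get((0, 0))  # absent cells simply come back as None
```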
You can either use an array of arrays, as you're doing, or an array of dictionaries. Either is fine, and I don't think there's any preference for one over the other. It all depends on which way is most convenient for you to set up the data structure in the first place. Accessing the data for the table view is equally easy using either method.