Data Processing, how to approach - scala

I have the following Problem, given this XML Datastructure:
<level1>
<level2ElementTypeA></level2ElementTypeA>
<level2ElementTypeB>
<level3ElementTypeA>String1Ineed<level3ElementTypeB>
</level2ElementTypeB>
...
<level2ElementTypeC>
<level3ElementTypeB attribute1>
<level4ElementTypeA>String2Ineed<level4ElementTypeA>
<level3ElementTypeB>
<level2ElementTypeC>
...
<level2ElementTypeD></level2ElementTypeD>
</level1>
<level1>...</level1>
I need to create an Entity which contain: String1Ineed and String2Ineed.
So every time I came across a level3ElementTypeB with a certain value in attribute1, I have my String2Ineed. The ugly part is how to obtain String1Ineed, which is located in the first element of type level2ElementTypeB above the current level2ElementTypeC.
My 'imperative' solution looks like that that I always keep an variable with the last value of String1Ineed and if I hit criteria for String2Ineed, I simply use that. If we look at this from a plain collection processing point of view. How would you model the backtracking logic between String1Ineed and String2Ineed? Using the State Monad?

Isn't this what XPATH is for? You can find String2Ineed and then change the axis to search back for String1Ineed.

Related

Parsing XML and retrieving attributes from (nested?) elements

I am trying to get specific data from an XML file, namely X, Y coordinates that are appear, to my beginners eyes, attributes of an element called "Point" in my file. I cannot get to that data with anything other than a sledgehammer approach and would gratefully accept some help.
I have used the following successfully:
for Shooter in root.iter('Shooter'):
print(Shooter.attrib)
But if I try the same with "Point" (or "Points") there is no output. I cannot even see "Point" when I use the following:
for child in root:
print(child.tag, child.attrib)
So: the sledgehammer
print([elem.attrib for elem in root.iter()])
Which gives me the attributes for every element. This file is a single collection of data and could contain hundreds of data points and so I would rather try to be a little more subtle and home in on exactly what I need.
My XML file
https://pastebin.com/abQT3t9k
UPDATE: Thanks for the answers so far. I tried the solution posted and ended up with 7000 lines of which wasn't quite what I was after. I should have explained in more detail. I also tried (as suggested)
def find_rec(node, element, result):
for item in node.findall(element):
result.append(item)
find_rec(item, element, result)
return result
print(find_rec(ET.parse(filepath_1), 'Shooter', [])) #Returns <Element 'Shooter' at 0x125b0f958>
print(find_rec(ET.parse(filepath_1), 'Point', [])) #Returns None
I admit I have never worked with XML files before, and I am new to Python (but enjoying it). I wanted to get the solution myself but I have spent days getting nowhere.
I perhaps should have just asked from the beginning how to extract the XY data for each ShotNbr (in this file there is just one) but I didn't want code written for me.
I've managed to get the XY from this file but my code will never work if there is more than one shot, or if I want to specifically look at, say, shot number 20.
How can I find shot number 2 (ShotNbr="2") and extract only its XY data points?
Assuming that you are using:
xml.etree.ElementTree,
You are only looking at the direct children of root.
You need to recurse into the tree to access elements lower in the hierarchical tree.
This seems to be the same problem as ElementTree - findall to recursively select all child elements
which has an excellent answer that I am not going to plagiarize.
Just apply it.
Alternatively,
import xml.etree.ElementTree as ET
root = ET.parse("file.xml")
print root.findall('.//Point')
Should work.
See: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax

How to create element from API with name from appropriate sequence?

I'd like to use EA to generate Requirement elements programatically. I need to use the same sequence numbering (REQ00000xy), as with the GUI when pressing "Auto" button in "Add Element ..." dialog in order to keep´consistent numbering for Requirement elements created either from GUI or from API.
Selecting the last used sequence number from already existing Requierement elements won't help, as it don't move the sequence number up and next Requirement created from GUI .
Is there a way to get (and properly use) the sequence number via EA API or EA SQL?
The table you're looking for is t_trxtypes. This contains something like (EA's output)
Description;NumericWeight;Notes;TRX;TRX_ID;Style;
Autocount;1,00;prefix=bla;suffix=x;active=1;active_a=0;counter=126;;Class;1; ;
You're interested in the column Notes which holds as CSV list like
prefix=bla;suffix=x;active=1;active_a=0;counter=126;
This is a test setting for a class which currently has the number 126. So the next created class would be named bla126x and the entry would change to
prefix=bla;suffix=x;active=1;active_a=0;counter=127;
Just keep the columne t_trxtypes.notes in synch with your creations.
Note EA does not (seem to) allow direct DB access. However, it has a proven back door:
Repository.Execute("UPDATE t_trxtypes SET Notes='prefix=bla;suffix=x;active=1;active_a=0;counter=127;' WHERE TRX_ID=<your id>")
will do the update (replace <your id> with the appropriate key). Though Execute is undocumented it works ever since and for sure Sparx will not limit it as nowadays everyone relies on it.
As a side note: This counter is not safe. There are lots of ways (the easiest is a simple rename) to break it. You'd need some script/add-in to have regular checks your numbering is still consistent. If you rely on requirement numbering you better use an external system like, I dare to say, DOORS.
Finally, RTFM....
For elements, where sequence is defined, if you use empty name in set =AddNew() function, EA generates the sequence upon .Update(). Not earlier. So if you plan to use the generated sequence and add some description, you need to create the element with empty name first, then Update() it and after that append your description to the content of the Name field.
As easy as this.

Auto complete in text field in odoo

I would like to select a customer from the select box by start typing its phone number.
How can I do that?
I have seen some are using name_search method.But still i am confused how to use it in both front end and back end.
Or is there any other solution for this.
Override the name_search method of your model and the domain you want on the args variable. Take a look at addons/account/account.py around line 595 args += [('type', '=', type)] for a concrete implementation. Make sure that you return the appropriate data structure as documented in the method's docstring at openerp/models.py.
For Auto complete in odoo. It provides suggestion only in case of using Many2one field in any module.
If you want to show suggestion and autocomplete. Create a model to store the mobile numbers and then use that particular model as foreign key in existing model.
That will do for you.

dataFrame keying using pandas groupby method

I new to pandas and trying to learn how to work with it. Im having a problem when trying to use an example I saw in one of wes videos and notebooks on my data. I have a csv file that looks like this:
filePath,vp,score
E:\Audio\7168965711_5601_4.wav,Cust_9709495726,-2
E:\Audio\7168965711_5601_4.wav,Cust_9708568031,-80
E:\Audio\7168965711_5601_4.wav,Cust_9702445777,-2
E:\Audio\7168965711_5601_4.wav,Cust_7023544759,-35
E:\Audio\7168965711_5601_4.wav,Cust_9702229339,-77
E:\Audio\7168965711_5601_4.wav,Cust_9513243289,25
E:\Audio\7168965711_5601_4.wav,Cust_2102513187,18
E:\Audio\7168965711_5601_4.wav,Cust_6625625104,-56
E:\Audio\7168965711_5601_4.wav,Cust_6073165338,-40
E:\Audio\7168965711_5601_4.wav,Cust_5105831247,-30
E:\Audio\7168965711_5601_4.wav,Cust_9513082770,-55
E:\Audio\7168965711_5601_4.wav,Cust_5753907026,-79
E:\Audio\7168965711_5601_4.wav,Cust_7403410322,11
E:\Audio\7168965711_5601_4.wav,Cust_4062144116,-70
I loading it to a data frame and the group it by "filePath" and "vp", the code is:
res = df.groupby(['filePath','vp']).size()
res.index
and the output is:
[E:\Audio\7168965711_5601_4.wav Cust_2102513187,
Cust_4062144116, Cust_5105831247,
Cust_5753907026, Cust_6073165338,
Cust_6625625104, Cust_7023544759,
Cust_7403410322, Cust_9513082770,
Cust_9513243289, Cust_9702229339,
Cust_9702445777, Cust_9708568031,
Cust_9709495726]
Now Im trying to approach the index like a dict, as i saw in examples, but when im doing
res['Cust_4062144116']
I get an error:
KeyError: 'Cust_4062144116'
I do succeed to get a result when im putting the filepath, but as i understand and saw in previouse examples i should be able to use the vp keys as well, isnt is so?
Sorry if its a trivial one, i just cant understand why it is working in one example but not in the other.
Rutger you are not correct. It is possible to "partial" index a multiIndex series. I simply did it the wrong way.
The index first level is the file name (e.g. E:\Audio\7168965711_5601_4.wav above) and the second level is vp. Meaning, for each file name i have multiple vps.
Now, this is correct:
res['E:\Audio\7168965711_5601_4.wav]
and will return:
Cust_2102513187 2
Cust_4062144116 8
....
but trying to index by the inner index (the Cust_ indexes) will fail.
You groupby two columns and therefore get a MultiIndex in return. This means you also have to slice using those to columns, not with a single index value.
Your .size() on the groupby object converts it into a Series. If you force it in a DataFrame you can use the .xs method to slice a single level:
res = pd.DataFrame(df.groupby(['filePath','vp']).size())
res.xs('Cust_4062144116', level=1)
That works. If you want to keep it as a series, boolean indexing can help, something like:
res[res.index.get_level_values(1) == 'Cust_4062144116']
The last option is a bit less readable, but sometimes also more flexibile, you could test for multiple values at once for example:
res[res.index.get_level_values(1).isin(['Cust_4062144116', 'Cust_6073165338'])]

Asp.Net MVC 2: How exactly does a view model bind back to the model upon post back?

Sorry for the length, but a picture is worth 1000 words:
In ASP.NET MVC 2, the input form field "name" attribute must contain exactly the syntax below that you would use to reference the object in C# in order to bind it back to the object upon post back. That said, if you have an object like the following where it contains multiple Orders having multiple OrderLines, the names would look and work well like this (case sensitive):
This works:
Order[0].id
Order[0].orderDate
Order[0].Customer.name
Order[0].Customer.Address
Order[0].OrderLine[0].itemID // first order line
Order[0].OrderLine[0].description
Order[0].OrderLine[0].qty
Order[0].OrderLine[0].price
Order[0].OrderLine[1].itemID // second order line, same names
Order[0].OrderLine[1].description
Order[0].OrderLine[1].qty
Order[0].OrderLine[1].price
However we want to add order lines and remove order lines at the client browser. Apparently, the indexes must start at zero and contain every consecutive index number to N.
The black belt ninja Phil Haack's blog entry here explains how to remove the [0] index, have duplicate names, and let MVC auto-enumerate duplicate names with the [0] notation. However, I have failed to get this to bind back using a nested object:
This fails:
Order.id // Duplicate names should enumerate at 0 .. N
Order.orderDate
Order.Customer.name
Order.Customer.Address
Order.OrderLine.itemID // And likewise for nested properties?
Order.OrderLine.description
Order.OrderLine.qty
Order.OrderLine.price
Order.OrderLine.itemID
Order.OrderLine.description
Order.OrderLine.qty
Order.OrderLine.price
I haven't found any advice out there yet that describes how this works for binding back nested ViewModels on post. Any links to existing code examples or strict examples on the exact names necessary to do nested binding with ILists?
Steve Sanderson has code that does this sort of thing here, but we cannot seem to get this to bind back to nested objects. Anything not having the [0]..[n] AND being consecutive in numbering simply drops off of the return object.
Any ideas?
We found a work around, by using the following:
Html.EditorFor(m => m, "ViewNameToUse", "FieldPrefix")
Where FieldPrefix is the "object[0]". This is hardly ideal, but it certainly works pretty well. It's simple and elegant.