Protobuf Trip Updates Full - real-time

I am writing and reading data to and from .pb files.
But how do I write output like the example shown in the documentation from my generated FeedMessage class?
Example:
https://developers.google.com/transit/gtfs-realtime/examples/trip-updates-full

Related

Pick up changes in json files that are being read by pyspark readstream?

I have json files where each file describes a particular entity, including its state. I am trying to pull these into Delta by using readStream and writeStream. This works perfectly for new files. These json files are frequently updated (i.e., states are changed, comments added, history items added, etc.), but the changed json files are not picked up by the readStream. I assume that is because readStream does not reprocess items. Is there a way around this?
One thing I am considering is changing my initial write of the json to add a timestamp to the file name so that it becomes a different record to the stream (I already have to do a de-duping in my writeStream anyway), but I am trying to not modify the code that is writing the json as it is already being used in production.
Ideally I would like to find something like the changeFeed functionality for Cosmos Db, but for reading json files.
Any suggestions?
Thanks!
This is not supported by Spark Structured Streaming: after a file is processed, it won't be processed again.
The closest thing to your requirement exists only in Databricks' Autoloader, which has a cloudFiles.allowOverwrites option that allows reprocessing of modified files.
P.S. Potentially, if you use the cleanSource option for the file source (https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#input-sources), it may reprocess files, but I'm not 100% sure.
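The timestamp-in-the-filename workaround from the question can be sketched in plain Python: every state change is written as a brand-new file, so a file-based stream (which never revisits a path it has already processed) sees each update as a fresh record. The function and directory names here are illustrative, not from the original pipeline:

```python
import json
import pathlib
import time

def write_versioned(record: dict, out_dir: str) -> pathlib.Path:
    """Write `record` to a new file whose name embeds a timestamp."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # a nanosecond timestamp keeps successive writes of the same entity distinct
    path = out / f"{record['id']}_{time.time_ns()}.json"
    path.write_text(json.dumps(record))
    return path
```

Downstream you keep only the newest file per entity id, which fits the de-duping the question says the writeStream already does.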

Sample geopandas data in networkx example

I am trying an example provided by networkx here. This example reads a geopandas dataset from a file named cholera_cases.gpkg.
# read in example data from a geopackage file. Geopackages
# are a format for storing geographic data that is backed
# by sqlite. geopandas reads data relying on the fiona package,
# providing a high-level pandas-style interface to geographic data.
cases = geopandas.read_file("cholera_cases.gpkg")
The example, however, does not mention where or how to obtain this dataset. I combed the GeoPandas website up and down and am unable to locate this file. I want to view the format of its contents and run this example.
If anyone is aware of where to obtain this kind of geopandas file, please advise.
If you go to their GitHub, you can find it in their repo. Here: https://github.com/networkx/networkx/tree/main/examples/geospatial
Might be worth cloning the repo to play with the examples.
As general advice, on webpages for projects like these, I always check for links to their GitHub/GitLab/other, because you get to see the project behind the scenes, and a local clone can be kept up to date.
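If you only want the one file rather than a full clone, it can be fetched straight from raw.githubusercontent.com. A sketch, assuming the file still lives at the repo path given in the answer (the URL may change if the examples move):

```python
import urllib.request

RAW_URL = ("https://raw.githubusercontent.com/networkx/networkx/"
           "main/examples/geospatial/cholera_cases.gpkg")

def fetch_example(dest: str = "cholera_cases.gpkg") -> str:
    """Download the sample GeoPackage so geopandas.read_file can open it."""
    urllib.request.urlretrieve(RAW_URL, dest)
    return dest

# then, with geopandas installed:
# import geopandas
# cases = geopandas.read_file(fetch_example())
```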

How to add an attribute in a json file via thmap?

I am a beginner with Talend, and I have a problem processing a JSON file. My JSON file has several levels and contains arrays at different levels (or depths). I just want to add an attribute to a JSON object located at a given depth via tHMap. So the input is the JSON file, and the output is the same JSON file with the new attribute. I have no idea how to configure tHMap, even though it is dedicated to simplifying complex mappings.
It's difficult to answer without more information. Can you post a screenshot of your tHMap? Usually it's quite simple: in the output, on the left-hand cell of the field, you add it there.
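The tHMap configuration itself is GUI work, but the underlying transformation is small: walk to the object at the given depth and set one extra key, leaving everything else untouched. A plain-Python sketch of that mapping for reference (the function name, path convention, and sample document are illustrative, not Talend API):

```python
import json

def add_attribute(doc, path, key, value):
    """Walk down `path` (dict keys or list indices) and set one new key."""
    node = doc
    for step in path:
        node = node[step]  # works for dict keys and list indices alike
    node[key] = value
    return doc

data = json.loads('{"order": {"items": [{"sku": "A1"}]}}')
add_attribute(data, ["order", "items", 0], "qty", 2)
print(json.dumps(data))
```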

Exporting NetLogo data to graph with nodes and edges

I have created some links between agents (turtles) in NetLogo. These links change at each time step. My aim is to export this data (i.e., the turtles and the links between them) to a graph with vertices (turtles) and edges (links), which can be given as input to Gephi. Is it possible to see the changes that occur in NetLogo reflected in the graph when it is linked with Gephi? Can someone help me out? Thanks.
To export your network data in a format usable by Gephi, I would suggest using the nw:save-graphml primitive from NetLogo's NW Extension. This will produce a file in the GraphML file format, which Gephi can read.
I guess you could re-save your network at each time step and overwrite your file, but I'm not sure if Gephi can display your changes dynamically. And depending on the size of your network, it might be slow.
Are you trying to use Gephi to see how a network generated by NetLogo changes over time? That's what @NicolasPayette's answer suggests, so I'll make the same assumption.
Gephi can display "dynamic graphs", i.e. networks that change over time. My understanding is that there are two file formats that allow Gephi to import dynamic graphs: GEXF, and a special CSV (comma-separated) format that Gephi calls "Spreadsheet". Nicolas mentioned GraphML, which is a very nice network data format, but it doesn't handle dynamic graphs. And as far as I know, NetLogo doesn't generate GEXF or Gephi's "Spreadsheet" format.
However, the Gephi Spreadsheet format is very simple, and it would not be difficult to write a NetLogo procedure that would write a file in that format. This procedure would write new rows to the "Spreadsheet" CSV file on each NetLogo tick. Then Gephi could read in the file, and you'd be able to move back and forth in time, seeing how the graph changes. (You might need to use a bit of trial and error to figure out how to write Spreadsheet files based on the description on the Gephi site.)
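A sketch of such a per-tick writer in Python (the same logic would port to a NetLogo file-writing procedure). The column names (Source, Target, Timeset) and the interval syntax are assumptions based on Gephi's CSV import; double-check them against the Gephi site, as noted above:

```python
import csv

def append_edges(path, tick, edges):
    """Append (source, target) pairs tagged with a one-tick time interval."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if f.tell() == 0:  # new file: write the header row first
            writer.writerow(["Source", "Target", "Timeset"])
        for source, target in edges:
            writer.writerow([source, target, f"<[{tick}, {tick + 1}]>"])

# call once per tick with that tick's current edge list
append_edges("edges.csv", 0, [(1, 2), (2, 3)])
append_edges("edges.csv", 1, [(1, 3)])
```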
Another option would be to display the evolving graph online using the GraphStream protocol. Plugins for NetLogo as well as for Gephi provide support for this.

Openstreetmap Xapi filters

Is there any way to filter the results returned by XAPI so that I don't have a ton of results to work through? I thought something like [filter=tag] might show only tags, but I can't seem to find any documentation saying this is possible.
Thanks
(1) The standard way of using XAPI lets you filter to retrieve only objects tagged with a certain tag; for example, to get just pubs you'd use:
http://jxapi.osm.rambler.ru/xapi/api/0.6/*[amenity=pub]
(2) If you want to filter an OSM file after you've downloaded it (e.g. to remove certain tags), Osmosis is a command-line tool that can do various types of filtering.
(3) If you want to filter an OSM file into some other format (i.e. you're not interested in having an OSM-format XML file at the end) you could use XSLT. Here is an XSLT I made which extracts a small number of pub parameters from an OSM file to CSV.
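The predicate in (1) is just part of the URL path, so queries can be assembled with simple string formatting. A sketch; the helper name is illustrative, and only the base URL and the [key=value] syntax come from the answer above:

```python
BASE = "http://jxapi.osm.rambler.ru/xapi/api/0.6/"

def xapi_query(element: str, key: str, value: str) -> str:
    """Build an XAPI URL such as BASE + '*[amenity=pub]'."""
    return f"{BASE}{element}[{key}={value}]"

print(xapi_query("*", "amenity", "pub"))
# → http://jxapi.osm.rambler.ru/xapi/api/0.6/*[amenity=pub]
```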