pandas supports DataFrame-to-JSON conversion, so a DataFrame can be converted to JSON data as shown below. (Parts 1) and 2) are just for reference and have nothing to do with SAPUI5.)
1) For example:
import pandas as pd
df = pd.DataFrame([['madrid', 10], ['venice', 20],['milan',40],['las vegas',35]],columns=['city', 'temp'])
df.to_json(orient="records")
gives:
[{"city":"madrid","temp":10},{"city":"venice","temp":20},{"city":"milan","temp":40},{"city":"las vegas","temp":35}]
and
df.to_json(orient="split")
gives:
{"columns":["city","temp"],"index":[0,1,2,3],"data":[["madrid",10],["venice",20],["milan",40],["las vegas",35]]}
As we now have JSON data, it could be used as input for the plot properties.
2) For the same JSON data I have created an API (running on localhost):
http://127.0.0.1:****/graph
The API in Flask (just for reference):
from flask import Flask
import pandas as pd

app = Flask(__name__)

@app.route('/graph')
def plot():
    df = pd.DataFrame([['madrid', 10], ['venice', 20], ['milan', 40], ['las vegas', 35]],
                      columns=['city', 'temp'])
    jsondata = df.to_json(orient='records')
    return jsondata

if __name__ == '__main__':
    app.run()
Postman result:
[
    {
        "city": "madrid",
        "temp": 10
    },
    {
        "city": "venice",
        "temp": 20
    },
    {
        "city": "milan",
        "temp": 40
    },
    {
        "city": "las vegas",
        "temp": 35
    }
]
3) How can I make use of this sample API to fetch data and then plot a sample graph of city vs. temp using SAPUI5?
I am looking for an example of how to do this, or any help on how to make use of APIs in SAPUI5.
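One note on the Flask side before any UI5 client can call it: if the SAPUI5 app is served from a different origin than the API, the browser will block the request unless the API sends CORS headers. A minimal sketch, assuming the third-party flask-cors package (pip install flask-cors):
from flask import Flask
from flask_cors import CORS  # third-party package, assumed installed

app = Flask(__name__)
CORS(app)  # adds Access-Control-Allow-Origin headers so a UI5 app on another origin can call /graph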
I'm trying to convert this JSON string data to a DataFrame in Databricks:
a = """{ "id": "a",
"message_type": "b",
"data": [ {"c":"abcd","timestamp":"2022-03-
01T13:10:00+00:00","e":0.18,"f":0.52} ]}"""
The schema I defined for the data is this (the types come from pyspark.sql.types):
from pyspark.sql.types import *
schema = StructType(
[
StructField("id",StringType(),False),
StructField("message_type",StringType(),False),
StructField("data", ArrayType(StructType([
StructField("c",StringType(),False),
StructField("timestamp",StringType(),False),
StructField("e",DoubleType(),False),
StructField("f",DoubleType(),False),
])))
,
]
)
And when I run this command:
df = sqlContext.createDataFrame(sc.parallelize([a]), schema)
I get this error
PythonException: 'TypeError: StructType can not accept object '{ "id": "a",\n"message_type": "JobMetric",\n"data": [ {"c":"abcd","timestamp":"2022-03- \n01T13:10:00+00:00","e":0.18,"f":0.52=} ]' in type <class 'str'>'. Full traceback below:
Could anyone help me with this? I would much appreciate it!
Your a variable is wrong: the object inside data is wrapped in quotes, which turns it into a string.
"data": [ "{"JobId":"ATLUPS10m2101V1","Timestamp":"2022-03-
01T13:10:00+00:00","number1":0.9098145961761475,"number2":0.5294908881187439}" ]
Should be
"data": [ {"JobId":"ATLUPS10m2101V1","Timestamp":"2022-03-
01T13:10:00+00:00","number1":0.9098145961761475,"number2":0.5294908881187439} ]
Also check that the field names are meant to match: JobId in the data versus job_id, and Timestamp versus timestamp.
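A quick sanity check for this (a sketch): calling json.loads on the string will raise immediately if embedded quotes or line breaks make it invalid JSON -
import json
record = json.loads(a)  # raises json.JSONDecodeError if a is malformed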
The issue is that when you pass a plain string object to a struct schema, Spark expects structured records (an RDD of rows matching [StringType, StringType, ...]), but in your current scenario it is getting just a string object. To fix it, first convert your string to a JSON object and then create an RDD from it. See the logic below for details -
Input Data -
a = """{"run_id": "1640c68e-5f02-4f49-943d-37a102f90146",
"message_type": "JobMetric",
"data": [ {"JobId":"ATLUPS10m2101V1","timestamp":"2022-03-01T13:10:00+00:00",
"score":0.9098145961761475,
"severity":0.5294908881187439
}
]
}"""
Converting to an RDD using the parsed JSON object -
from pyspark.sql.types import *
import json
schema=StructType(
[
StructField("run_id",StringType(),False),
StructField("message_type",StringType(),False),
StructField("data", ArrayType(StructType([
StructField("JobId",StringType(),False),
StructField("timestamp",StringType(),False),
StructField("score",DoubleType(),False),
StructField("severity",DoubleType(),False),
])))
,
]
)
df = spark.createDataFrame(data=sc.parallelize([json.loads(a)]),schema=schema)
df.show(truncate=False)
Output -
+------------------------------------+------------+--------------------------------------------------------------------------------------+
|run_id |message_type|data |
+------------------------------------+------------+--------------------------------------------------------------------------------------+
|1640c68e-5f02-4f49-943d-37a102f90146|JobMetric |[{ATLUPS10m2101V1, 2022-03-01T13:10:00+00:00, 0.9098145961761475, 0.5294908881187439}]|
+------------------------------------+------------+--------------------------------------------------------------------------------------+
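As a side note, a minimal alternative sketch: spark.read.json also accepts an RDD of JSON strings together with a schema, so the json.loads step can be skipped -
# Alternative sketch: let spark.read.json parse the raw string against the schema.
df = spark.read.json(sc.parallelize([a]), schema=schema)
df.show(truncate=False)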
I want to mock MongoDB in order to write some unit tests with unittest for Flask. The documentation about this is huge and I don't really understand how to do it.
I want to test a POST method with the following data:
from unittest import TestCase, main as unittest_main, mock
from bson.objectid import ObjectId
from app import app
sample_user = {
'Id': ObjectId('5d55cffc4a3d4031f42827a3'),
'Username': 'LeTest',
'Mail': 'sendme@gmail.com',
'password': 'test123',
'Qrcode': 'TODO'
}
Can you explain how I can test whether sample_user was added to my Mongo collection?
Thanks!
I found the answer. Here is my code to mock MongoDB data with Flask:
def test_post_food(self):
    # Mock the food collection handle defined in api/food.py
    # (mock was imported above via: from unittest import ... mock)
    with mock.patch('api.food.food') as MockFood:
        # Force the return value of food.insert_one(json) to sample_food
        MockFood.insert_one.return_value = sample_food
        with self.client.post("/api/addFood", json=sample_food[0]) as res:
            # Check that food.insert_one(json) was called
            MockFood.insert_one.assert_called()
            self.assertEqual(res.status_code, 200)
            self.assertEqual(res.data, b'{"Response":"Food was added"}\n')
sample_food = [{
"_id": {
"$oid": "619e8f45ee462d6d876bbdbc"
},
'Utilisateur': "999",
'Nom': 'Danette Vanille',
'Marque': 'Danone',
'Quantite': 4,
'ingredients': [
'lait entier',
'lait écrémé reconstitué à base de lait en poudre',
'sucre',
'crème',
'lait écrémé concentré ou en poudre',
'épaississants (amidon modifié, carraghénanes)',
'perméat de petit lait (lactosérum) en poudre',
'amidon',
'arôme (lait)',
'colorant (bêta-carotène)'
],
'Date': '20/12/2021',
'Valeurs': {
'Energie': '107 kcal',
'Matières grasses': '3,0g',
'Glucides': '17,1g',
'Proteines': '3g',
'Sel': '0,14g'
},
'Poids': '125g',
'Lieu': 'Frigo',
'Category': "Produit laitiers"
}]
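As an alternative to patching, an in-memory fake can be used. A minimal sketch, assuming the third-party mongomock package (pip install mongomock):
import mongomock

# mongomock.MongoClient is an in-memory stand-in for pymongo's client,
# so inserts can be verified with real queries instead of call assertions.
client = mongomock.MongoClient()
food = client.test_db.food
food.insert_one({'Nom': 'Danette Vanille', 'Marque': 'Danone'})
assert food.count_documents({'Nom': 'Danette Vanille'}) == 1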
I have a DataFrame which contains an id column and a JSON string column:
val df = Seq (
(0, """{"device_id": 0, "device_type": "sensor-ipad", "ip": "68.161.225.1", "cca3": "USA", "cn": "United States", "temp": 25, "signal": 23, "battery_level": 8, "c02_level": 917, "timestamp" :1475600496 }"""),
(1, """{"device_id": 1, "device_type": "sensor-igauge", "ip": "213.161.254.1", "cca3": "NOR", "cn": "Norway", "temp": 30, "signal": 18, "battery_level": 6, "c02_level": 1413, "timestamp" :1475600498 }""")
).toDF("id", "json")
I want to save it as JSON - without a nested JSON string in it, but with 'raw' JSON instead.
When I run
df.write.json("path")
it saves my json column as a string:
{"id":0,"json":"{\"device_id\": 0, \"device_type\": \"sensor-ipad\", \"ip\": \"68.161.225.1\", \"cca3\": \"USA\", \"cn\": \"United States\", \"temp\": 25, \"signal\": 23, \"battery_level\": 8, \"c02_level\": 917, \"timestamp\" :1475600496 }"}
And what I need is:
{"id": 0,"json": {"device_id": 0,"device_type": "sensor-ipad","ip": "68.161.225.1","cca3": "USA","cn": "United States","temp": 25,"signal": 23,"battery_level": 8,"c02_level": 917,"timestamp": 1475600496}}
How can I achieve this? Please note that the structure of the JSON could be different for each row; it can contain additional fields.
You can use the from_json function to parse the JSON string data into a new structured column:
// get schema of the json data
// You can also define your own schema
import org.apache.spark.sql.functions._
import spark.implicits._ // needed for .as[String] and the $"" column syntax

val json_schema = spark.read.json(df.select("json").as[String]).schema
val resultDf = df.withColumn("json", from_json($"json", json_schema))
Output:
{"id":0,"json":{"battery_level":8,"c02_level":917,"cca3":"USA","cn":"United States","device_id":0,"device_type":"sensor-ipad","ip":"68.161.225.1","signal":23,"temp":25,"timestamp":1475600496}}
{"id":1,"json":{"battery_level":6,"c02_level":1413,"cca3":"NOR","cn":"Norway","device_id":1,"device_type":"sensor-igauge","ip":"213.161.254.1","signal":18,"temp":30,"timestamp":1475600498}}
How to Extract value from Cloudant IBM Bluemix NoSQL Database stored in JSON format?
I tried this code:
def readDataFrameFromCloudant(host, user, pw, database):
    cloudantdata = spark.read.format("com.cloudant.spark"). \
        option("cloudant.host", host). \
        option("cloudant.username", user). \
        option("cloudant.password", pw). \
        load(database)
    cloudantdata.createOrReplaceTempView("washing")
    spark.sql("SELECT * from washing").show()
    return cloudantdata
hostname = ""
user = ""
pw = ""
database = "database"
cloudantdata=readDataFrameFromCloudant(hostname, user, pw, database)
The data is stored in this format:
{
    "_id": "31c24a382f3e4d333421fc89ada5361e",
    "_rev": "1-8ba1be454fed5b48fa493e9fe97bedae",
    "d": {
        "count": 9,
        "hardness": 72,
        "temperature": 85,
        "flowrate": 11,
        "fluidlevel": "acceptable",
        "ts": 1502677759234
    }
}
I want the nested fields under d flattened into top-level columns. (The expected and actual outcomes were attached as screenshots.)
Create a dummy dataset for reproducing the issue:
cloudantdata = spark.read.json(sc.parallelize(["""
{
"_id": "31c24a382f3e4d333421fc89ada5361e",
"_rev": "1-8ba1be454fed5b48fa493e9fe97bedae",
"d": {
"count": 9,
"hardness": 72,
"temperature": 85,
"flowrate": 11,
"fluidlevel": "acceptable",
"ts": 1502677759234
}
}
"""]))
cloudantdata.take(1)
Returns:
[Row(_id='31c24a382f3e4d333421fc89ada5361e', _rev='1-8ba1be454fed5b48fa493e9fe97bedae', d=Row(count=9, flowrate=11, fluidlevel='acceptable', hardness=72, temperature=85, ts=1502677759234))]
Now flatten:
flat_df = cloudantdata.select("_id", "_rev", "d.*")
flat_df.take(1)
Returns:
[Row(_id='31c24a382f3e4d333421fc89ada5361e', _rev='1-8ba1be454fed5b48fa493e9fe97bedae', count=9, flowrate=11, fluidlevel='acceptable', hardness=72, temperature=85, ts=1502677759234)]
I tested this code with an IBM Data Science Experience notebook using Python 3.5 (Experimental) with Spark 2.0.
This answer is based on: https://stackoverflow.com/a/45694796/1033422
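If only some of the nested fields are needed, here is a sketch of the equivalent explicit selection with aliases (field names taken from the document above):
from pyspark.sql.functions import col

subset_df = cloudantdata.select(
    "_id",
    col("d.temperature").alias("temperature"),
    col("d.flowrate").alias("flowrate"),
)
subset_df.take(1)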
I am new to pandas (well, to all things "programming"...), but have been encouraged to give it a try.
I have a mongodb database - "test" - with a collection called "tweets".
I access the database in ipython:
import sys
import pymongo
from pymongo import MongoClient  # Connection was removed in PyMongo 3; MongoClient replaces it

connection = MongoClient()
db = connection.test
tweets = db.tweets
The documents in the tweets collection have the following structure:
{u'entities': {u'hashtags': [],
u'symbols': [],
u'urls': [],
u'user_mentions': []},
u'favorite_count': 0,
u'favorited': False,
u'filter_level': u'medium',
u'geo': {u'coordinates': [placeholder coordinate, -placeholder coordinate], u'type': u'Point'},
u'id': 349223842700472320L,
u'id_str': u'349223842700472320',
u'in_reply_to_screen_name': None,
u'in_reply_to_status_id': None,
u'in_reply_to_status_id_str': None,
u'in_reply_to_user_id': None,
u'in_reply_to_user_id_str': None,
u'lang': u'en',
u'place': {u'attributes': {},
u'bounding_box': {u'coordinates': [[[placeholder coordinate, placeholder coordinate],
[-placeholder coordinate, placeholder coordinate],
[-placeholder coordinate, placeholder coordinate],
[-placeholder coordinate, placeholder coordinate]]],
u'type': u'Polygon'},
u'country': u'placeholder country',
u'country_code': u'example',
u'full_name': u'name, xx',
u'id': u'user id',
u'name': u'name',
u'place_type': u'city',
u'url': u'http://api.twitter.com/1/geo/id/1820d77fb3f65055.json'},
u'retweet_count': 0,
u'retweeted': False,
u'source': u'Twitter for iPhone',
u'text': u'example text',
u'truncated': False,
u'user': {u'contributors_enabled': False,
u'created_at': u'Sat Jan 22 13:42:59 +0000 2011',
u'default_profile': False,
u'default_profile_image': False,
u'description': u'example description',
u'favourites_count': 100,
u'follow_request_sent': None,
u'followers_count': 100,
u'following': None,
u'friends_count': 100,
u'geo_enabled': True,
u'id': placeholder_id,
u'id_str': u'placeholder_id',
u'is_translator': False,
u'lang': u'en',
u'listed_count': 0,
u'location': u'example place',
u'name': u'example name',
u'notifications': None,
u'profile_background_color': u'000000',
u'profile_background_image_url': u'http://a0.twimg.com/images/themes/theme19/bg.gif',
u'profile_background_image_url_https': u'https://si0.twimg.com/images/themes/theme19/bg.gif',
u'profile_background_tile': False,
u'profile_banner_url': u'https://pbs.twimg.com/profile_banners/241527685/1363314054',
u'profile_image_url': u'http://a0.twimg.com/profile_images/378800000038841219/8a71d0776da0c48dcc4ef6fee9f78880_normal.jpeg',
u'profile_image_url_https': u'https://si0.twimg.com/profile_images/378800000038841219/8a71d0776da0c48dcc4ef6fee9f78880_normal.jpeg',
u'profile_link_color': u'000000',
u'profile_sidebar_border_color': u'FFFFFF',
u'profile_sidebar_fill_color': u'000000',
u'profile_text_color': u'000000',
u'profile_use_background_image': False,
u'protected': False,
u'screen_name': u'placeholder screen_name',
u'statuses_count': xxxx,
u'time_zone': u'placeholder time_zone',
u'url': None,
u'utc_offset': -21600,
u'verified': False}}
Now, as far as I understand, pandas' main data structure - a spreadsheet-like table - is called a DataFrame. How can I load the data from my "tweets" collection into a pandas DataFrame? And how can I query for a subdocument within the database?
Materialize the cursor you get from MongoDB (for example with list()) before passing it to DataFrame:
import pandas as pd
df = pd.DataFrame(list(tweets.find()))
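Note that the _id column will contain BSON ObjectId values; as a small sketch, they can be cast to strings for easier handling:
df['_id'] = df['_id'].astype(str)  # ObjectId -> hex string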
If you have data in MongoDB like this:
[
{
"name": "Adam",
"age": 27,
"address":{
"number": 4,
"street": "Main Road",
"city": "Oxford"
}
},
{
"name": "Steve",
"age": 32,
"address":{
"number": 78,
"street": "High Street",
"city": "Cambridge"
}
}
]
You can put the data straight into a DataFrame like this:
from pandas import DataFrame
df = DataFrame(list(db.collection_name.find({})))
And you will get this output:
df.head()
| | name | age | address |
|----|---------|------|--------------------------------------------------------------|
| 0 | "Adam" | 27 | {"number": 4, "street": "Main Road", "city": "Oxford"} |
| 1 | "Steve" | 32 | {"number": 78, "street": "High Street", "city": "Cambridge"} |
However, the subdocuments will just appear as JSON inside the subdocument cell. If you want to flatten objects so that subdocument properties are shown as individual cells, you can use json_normalize without any parameters.
from pandas.io.json import json_normalize  # in pandas >= 1.0 this is: from pandas import json_normalize
datapoints = list(db.collection_name.find({}))
df = json_normalize(datapoints)
df.head()
This will give the dataframe in this format:
| | name | age | address.number | address.street | address.city |
|----|--------|------|----------------|----------------|--------------|
| 0 | Adam | 27 | 4 | "Main Road" | "Oxford" |
| 1 | Steve | 32 | 78 | "High Street" | "Cambridge" |
You can load your MongoDB data into a pandas DataFrame using this code. It works for me.
import pymongo
import pandas as pd
from pymongo import MongoClient  # Connection was removed in PyMongo 3

connection = MongoClient()
db = connection.database_name
input_data = db.collection_name
data = pd.DataFrame(list(input_data.find()))
Use:
df = pd.DataFrame.from_records(collection.find())
This is the simplest technique to achieve your aim; from_records consumes an iterable of documents directly (from_dict expects a dict, not a pymongo collection).
import pymongo
import pandas as pd
from pymongo import MongoClient  # Connection was removed in PyMongo 3

conn = MongoClient()
db = conn.your_database_name
input_data = db.your_collection_name
pandas_data_frame = pd.DataFrame(list(input_data.find()))
print(pandas_data_frame)
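As a final note, find() also accepts a projection, so unwanted fields such as _id can be excluded up front instead of dropped afterwards. A minimal sketch (field names from the tweet documents above):
# Sketch: fetch only the fields you need and skip _id entirely.
docs = db.your_collection_name.find({}, {'_id': 0, 'text': 1, 'lang': 1})
pandas_data_frame = pd.DataFrame(list(docs))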