Given that my PureScript program contains different types representing items that can be exchanged, for example Vegetable, Milk, Meat etc., what is the best way to represent a ledger data structure that tracks the exchanges between participants? To simplify, we can represent participants as type Participant = Int.
You can use purescript-variant for open sums.
type Ledger products = Array (Entry products)
type Entry products =
  { date ∷ DateTime
  , product ∷ Variant products
  , unitPrice ∷ Int
  , quantity ∷ Int
  }
A Ledger type can then be instantiated for a particular set of products, such as Ledger (milk ∷ Milk, vegetable ∷ Vegetable, meat ∷ Meat).
I wanted to write a method to process sales data in a way that the sales are sorted by date and concatenated with an entry number and a sale type like this:
0/2018-05-02 01:55:07/Sale type A,1/2018-09-22 02:55:07/Sale type B
But for now I have only managed to concatenate saleDate and saleType. How can I produce an entry number for each record? By entry number I mean the position of the sale after sorting by date.
def concatSales(sales: Seq[Sale]): Seq[String] = {
sales
.sortWith(_.saleDate < _.saleDate)
.map(sale => s"$DELIMITER${sale.saleDate}$DELIMITER${sale.saleType}")
}
If you want to assign an index for each element, you can use zipWithIndex:
sales
.sortWith(_.saleDate < _.saleDate)
.zipWithIndex
.map {
case (sale, idx) => s"$idx: ..."
}
Note that you might want to use .sortBy instead of .sortWith since it looks simpler:
sales.sortBy(_.saleDate)
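For comparison, the same sort-then-number idea can be sketched in Python, the language used in another thread on this page. This is only an illustration and not part of the Scala answer: the Sale class, the concat_sales function, and the "/" delimiter are assumptions modelled on the sample output in the question.

```python
from dataclasses import dataclass
from datetime import datetime

DELIMITER = "/"  # assumed from the sample output in the question


@dataclass
class Sale:
    # Hypothetical stand-in for the question's Sale type.
    sale_date: datetime
    sale_type: str


def concat_sales(sales):
    """Sort by date, then prefix each entry with its position in that order."""
    ordered = sorted(sales, key=lambda s: s.sale_date)  # plays the role of sortBy(_.saleDate)
    return [
        DELIMITER.join([str(idx), str(sale.sale_date), sale.sale_type])
        for idx, sale in enumerate(ordered)  # plays the role of zipWithIndex
    ]
```

Here enumerate pairs each sorted element with its index, which is the job zipWithIndex does in the Scala version.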
Scenario
Players take part in events. They should indicate whether or not they are going to attend an event.
Problem to be solved
I want to select the players who haven't yet indicated whether they plan to attend a specific event.
Your goal
As a beginner in the technologies used, I would appreciate your validation of the solution I suggest below, and any recommendations for improving it.
Solution
Technologies: Python, PostgreSQL, and Pony ORM
Entity model in Pony ORM:
class Event(db.Entity):
    _table_ = "event"
    start = Required(date)
    players = Set("Attendance")

class Player(db.Entity):
    _table_ = "player"
    first_name = Optional(str)
    last_name = Required(str)
    phone_number = Required(str)
    email = Required(str)
    events = Set("Attendance")

class Attendance(db.Entity):
    _table_ = "attendance"
    event = Required(Event)
    player = Required(Player)
    status = Required(bool)
    PrimaryKey(event, player)
Idea:
1. Get list of players that provided the information if they attend the event
2. Get list of players that are not in list created in 1.
Current implementation of the idea:
players = select(p for p in Player if p not in select(p for p in Player
for a in Attendance if p == a.player and a.event == next_event))
It is possible to refactor your query to make it simpler. First, we can replace the explicit join of Player and Attendance inside the inner query with an implicit join via attribute access:
select(p for p in Player if p not in select(p for p in Player
for attendance in p.events if attendance.event == next_event))
To simplify the query further, we can use attribute lifting by writing the expression p.events.event:
select(p for p in Player if p not in select(
p for p in Player if next_event in p.events.event))
The expression p.events returns a set of Attendance records. In Pony, a set instance exposes all the attributes of its items, and accessing such an attribute yields the set of the corresponding attribute values across all items. The value of the expression p.events.event is therefore the set of all Event objects linked to a particular player.
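What attribute lifting computes can be pictured in plain Python, without Pony. The sketch below is only a mental model under simplifying assumptions: the dataclasses merely borrow the entity names from the model above, and lifted_events is a hypothetical helper, not a Pony function. Pony itself translates the expression to SQL rather than building Python sets.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    name: str


@dataclass(frozen=True)
class Attendance:
    event: Event
    status: bool


@dataclass
class Player:
    last_name: str
    events: list = field(default_factory=list)  # the player's Attendance records


def lifted_events(player):
    """Plain-Python picture of p.events.event: the set of every Event
    reachable through the player's Attendance records."""
    return {attendance.event for attendance in player.events}


next_event = Event("next")
other_event = Event("other")
answered = Player("Smith", [Attendance(next_event, False)])
silent = Player("Jones", [Attendance(other_event, True)])

# A player with no Attendance record for next_event has not answered yet.
unanswered = [p for p in (answered, silent) if next_event not in lifted_events(p)]
```

Note that a player who answered "no" (status False) still has an Attendance record, so only players with no record at all for the event are selected.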
The next step in simplifying the query is to replace the generators with lambdas. This way the query looks a bit shorter:
Player.select(lambda p: p not in Player.select(
lambda p: next_event in p.events.event))
But the biggest simplification comes from realizing that the inner query is unnecessary, so the query can be rewritten as:
Player.select(lambda p: next_event not in p.events.event)
I think this is the most concise way to write this query using PonyORM.
From the Slick documentation, it's clear how to make a single left join between two tables.
val q = for {
(t, v) <- titles joinLeft volumes on (_.uid === _.titleUid)
} yield (t, v)
Query q will, as expected, have attributes: _1 of type Titles and _2 of type Rep[Option[Volumes]] to cover for non-existing volumes.
Further cascading is problematic:
val q = for {
((t, v), c) <- titles
joinLeft volumes on (_.uid === _.titleUid)
joinLeft chapters on (_._2.uid === _.volumeUid)
} yield /* etc. */
This won't work because _._2.uid === _.volumeUid is invalid: _._2 has no uid member.
According to various sources on the net, this shouldn't be an issue, but then again, sources tend to target different Slick versions and 3.0 is still rather new. Does anyone have a clue about the issue?
To clarify, the idea is to use two left joins to extract data from 3 cascading 1:n:n tables.
Equivalent SQL would be:
Select *
from titles
left join volumes
on titles.uid = volumes.title_uid
left join chapters
on volumes.uid = chapters.volume_uid
Your second left join is no longer operating on a TableQuery[Titles], but instead on what is effectively a Query[(Titles, Option[Volumes])] (ignoring the result and collection type parameters). When you join the resulting query on your TableQuery[Chapters] you can access the second entry in the tuple using the _2 field (since it's an Option you'll need to map to access the uid field):
val q = for {
((t, v), c) <- titles
joinLeft volumes on (_.uid === _.titleUid)
joinLeft chapters on (_._2.map(_.uid) === _.volumeUid)
} yield /* etc. */
Avoiding TupleN
If the _N field syntax is unclear, you can instead use Slick's capacity for user-defined record types to map your rows:
// The `Table` variant of the joined row representation
case class TitlesAndVolumesRow(title: Titles, volumes: Volumes)
// The DTO variant of the joined row representation
case class TitleAndVolumeRow(title: Title, volumes: Volume)
implicit object TitleAndVolumeShape
extends CaseClassShape(TitlesAndVolumesRow.tupled, TitleAndVolumeRow.tupled)
Let's say I have a table such as the one below, that may or may not contain duplicates for a given field:
ID URL
--- ------------------
001 http://example.com/adam
002 http://example.com/beth
002 http://example.com/beth?extra=blah
003 http://example.com/charlie
I would like to write a Pig script to find only DISTINCT rows, based on the value of a single field. For instance, filtering the table above by ID should return something like the following:
ID URL
--- ------------------
001 http://example.com/adam
002 http://example.com/beth
003 http://example.com/charlie
The Pig GROUP BY operator returns a bag of tuples grouped by ID, which would work if I knew how to get just the first tuple per bag (perhaps a separate question).
The Pig DISTINCT operator works on the entire row, so in this case all four rows would be considered unique, which is not what I want.
For my purposes, I do not care which of the rows with ID 002 are returned.
I found one way to do this, using the GROUP BY and the TOP operators:
my_table = LOAD 'my_table_file' AS (A, B);
my_table_grouped = GROUP my_table BY A;
my_table_distinct = FOREACH my_table_grouped {
-- For each group $0 refers to the group name, (A)
-- and $1 refers to a bag of entire rows {(A, B), (A, B), ...}.
-- Here, we take only the first (top 1) row in the bag:
result = TOP(1, 0, $1);
GENERATE FLATTEN(result);
}
DUMP my_table_distinct;
This results in one row per distinct ID:
(001,http://example.com/adam)
(002,http://example.com/beth?extra=blah)
(003,http://example.com/charlie)
I don't know if there is a better approach, but this works for me. I hope this helps others starting out with Pig.
(Reference: http://pig.apache.org/docs/r0.12.1/func.html#topx)
I have found that you can do this with a nested grouping and LIMIT. So, using Arel's example:
my_table = LOAD 'my_table_file' AS (A, B);
-- Nested foreach grouping generates bags with same A,
-- limit bags to 1
my_table_distinct = FOREACH (GROUP my_table BY A) {
result = LIMIT my_table 1;
GENERATE FLATTEN(result);
}
DUMP my_table_distinct;
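The effect of both approaches above, keeping a single row per key, can be illustrated outside Pig with a few lines of Python. This is a sketch of the semantics only, not of how Pig executes the job. The function name first_row_per_key and the in-memory rows are assumptions for illustration, and this version always keeps the first row seen for each key.

```python
def first_row_per_key(rows, key_index=0):
    """Keep one row per distinct value of the key column (the first one seen)."""
    kept = {}
    for row in rows:
        kept.setdefault(row[key_index], row)  # later duplicates of a key are ignored
    return list(kept.values())


rows = [
    ("001", "http://example.com/adam"),
    ("002", "http://example.com/beth"),
    ("002", "http://example.com/beth?extra=blah"),
    ("003", "http://example.com/charlie"),
]
distinct_rows = first_row_per_key(rows)
```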
You can use FirstTupleFromBag from Apache DataFu™ (incubating):
register datafu-pig-incubating-1.3.1.jar
define FirstTupleFromBag datafu.pig.bags.FirstTupleFromBag();
my_table_grouped = GROUP my_table BY A;
my_table_grouped_first_tuple = foreach my_table_grouped generate flatten(FirstTupleFromBag(my_table,null));
Consider two tables, Bill and Product, with a many-to-many relationship. How do you get all the bills for a particular product using Entity SQL?
Something like this
SELECT B FROM [Container].Products as P
OUTER APPLY P.Bills AS B
WHERE P.ProductID == 1
will produce a row for each Bill
Another option is something like this:
SELECT P, (SELECT B FROM P.Bills)
FROM [Container].Products AS P
WHERE P.ProductID == 1
Which will produce a row for each matching Product (in this case just one)
and the second column in the row will include a nested result set containing the bills for that product.
Hope this helps
Alex
You need to use some LINQ like this:
...
using (YourEntities ye = new YourEntities())
{
Product myProduct = ye.Product.First(p => p.ProductId == idParameter);
// Load() populates the navigation collection; it does not return it.
myProduct.Bill.Load();
var bills = myProduct.Bill;
}
...
This assumes that you have used the Entity Framework to build a model for your data.
The bills variable will hold a collection of Bill objects that are related to your product object.
Hope it helps.