Let's say I have a pair of conversion functions
string2int :: String -> Maybe Int
int2string :: Int -> String
I could represent these fairly easily using Optics.
stringIntPrism :: Prism String Int
However, if I want to represent the failure reason, I'd need to keep these as two separate functions.
string2int :: String -> Validation [ParseError] Int
int2string :: Int -> String
For this simple example, Maybe is perfectly fine: we can always assume that a failure is a parse failure, so we don't actually have to encode this using an Either or Validation type.
However, imagine that, in addition to my parsing prism, I want to perform some validation:
isOver18 :: Int -> Validation [AgeError] Int
isUnder55 :: Int -> Validation [AgeError] Int
It would be ideal to be able to compose these things together, such that I could have
ageField = isUnder55 . isOver18 . string2int :: ValidationPrism [e] String Int
This is fairly trivial to build by hand, however it seems like a common enough concept that there might be something lurking in the field of Lenses/Optics that does this already. Is there an existing abstraction that handles this?
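For concreteness, here is a minimal hand-rolled sketch of what I mean (the type and names are hypothetical, not from any library; Validation is inlined to keep it self-contained):

-- Mirrors the Validation type from the `validation` package.
data Validation e a = Failure e | Success a

-- A hypothetical "validation prism": a total reverse direction plus a
-- forward match that can fail with errors of type e.
data ValidationPrism e s a = ValidationPrism
  { vpReview :: a -> s
  , vpMatch  :: s -> Validation e a
  }

-- Compose two of them end to end (both sides must report errors of the same type e).
vpCompose :: ValidationPrism e a b -> ValidationPrism e s a -> ValidationPrism e s b
vpCompose inner outer = ValidationPrism
  { vpReview = vpReview outer . vpReview inner
  , vpMatch  = \s -> case vpMatch outer s of
      Failure e -> Failure e
      Success a -> vpMatch inner a
  }

-- A pure validation step such as isOver18 can be lifted into one, with
-- the identity as its reverse direction.
vpFromValidation :: (a -> Validation e a) -> ValidationPrism e a a
vpFromValidation f = ValidationPrism { vpReview = id, vpMatch = f }

With something like this, ageField is just a few applications of vpCompose; the question is whether an existing optics abstraction already covers it.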
tl;dr
Is there a standard way of implementing a partial lens / prism / iso that can be parameterised over an arbitrary functor instead of being tied directly to Maybe?
I've used Haskell notation above since it's more straightforward, but I'm actually using Monocle in Scala to implement this. I would be perfectly happy with an answer specific to, e.g., ekmett's lens library.
I have recently written a blog post about indexed optics, which also explores a bit how we can do coindexed optics.
In short: coindexed optics are possible, but we have yet to do some further research there, especially because if we try to translate that approach into lens's encoding (from profunctor to van Laarhoven) it gets even hairier (though I think we can get away with only 7 type variables).
And we cannot really do this without altering how indexed optics are currently encoded in lens. So for now, you're better off using validation-specific libraries.
To give a hint of the difficulties: When we try to compose with Traversals, should we have
-- like `over`, but also returns the errors for elements that weren't matched
validatedOver :: CoindexedOptic' s a -> (a -> a) -> s -> (ValidationErrors, s)
or something else? If we could only compose coindexed prisms, their value wouldn't justify their complexity; they wouldn't "fit" into the optics framework.
I am interested in implementing something like Freer Monads, or rather Extensible Effects, in PureScript, but using rows rather than an open union (I suppose it is possible).
However, I wasn't able to define a kind without foreign import. I want to be able to do something like:
kind X
data Y :: # X -> Type -> Type
data Z :: X
Is that something I can do or should I look for another approach?
Nathan Faubion has an implementation of extensible effects, called purescript-run, using row polymorphism, variants and proxies.
In functional optics, a well-behaved prism (called a partial lens in Scala, I believe) is supposed to have a set function of type 'subpart -> 'parent -> 'parent. If the prism "succeeds", i.e. is structurally compatible with the 'parent argument given, it returns the given 'parent with the appropriate subpart replaced by the 'subpart value given. If the prism "fails", i.e. is structurally incompatible with the 'parent argument, it returns the given 'parent unmodified.
I'm wondering why the set function doesn't return a 'parent option (Maybe, for Haskellers) to represent its pass/fail nature. Shouldn't the programmer be able to tell from the return type whether the set was "successful" or not?
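For a concrete picture of the behaviour I mean, this is what Haskell's lens does with _Left (my illustration, not part of the library documentation):

set _Left (5 :: Int) (Left 1    :: Either Int String)   -- Left 5    (the prism matched; subpart replaced)
set _Left (5 :: Int) (Right "x" :: Either Int String)   -- Right "x" (no match; value returned unchanged)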
I know there's been a lot of research and thought put into the realm of functional optics, so I'm sure there must be a definitive answer that I just can't seem to find.
(I'm from an F# background, so I apologize if the syntax I've used is a bit opaque for Haskell or Scala programmers).
I doubt there's one definitive answer, so I'll give you two here.
Origin
I believe prisms were first imagined (by Dan Doel, if my vague recollection is correct) as "co-lenses". Whereas a lens from s to a offers
get :: s -> a
set :: (s, a) -> s
a prism from s to a offers
coget :: a -> s
coset :: s -> Either s a
All the arrows are reversed, and the product, (,), is replaced by a coproduct, Either. So a prism in the category of types and functions is a lens in the dual category.
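As a concrete illustration (my own, not from the original formulation), the familiar prism onto the Just constructor looks like this in the "co-lens" presentation:

cogetJust :: a -> Maybe a
cogetJust = Just

cosetJust :: Maybe a -> Either (Maybe a) a
cosetJust (Just a) = Right a
cosetJust Nothing  = Left Nothing   -- no match: hand the original structure back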
For simple prisms, that s -> Either s a seems a bit weird. Why would you want the original value back? But the lens package also offers type-changing optics. So we end up with
get :: s -> a
set :: (s, b) -> t
coget :: a -> s
coset :: t -> Either s b
Suddenly what we're getting back in the non-matching case may actually be a bit different! What's that about? Here's an example:
cogetLeft :: a -> Either a x
cogetLeft = Left
cosetLeft :: Either b x -> Either (Either a x) b
cosetLeft (Left b) = Right b
cosetLeft (Right x) = Left (Right x)
In the second (non-matching) case, the value we get back is the same, but its type has been changed.
Nice hierarchy
In both van Laarhoven (as in lens) and profunctor-style frameworks, lenses and prisms can also stand in for traversals. To do that, they need to have similar forms, and this design accomplishes that. leftaroundabout's answer gives more detail on this aspect.
To answer the “why”: lenses etc. are pretty rigidly derived from category theory, so this is actually quite clear-cut. The behaviour you describe just drops out of the maths; it's not something anybody defined for any particular purpose, but follows from far more general ideas.
Ok, that's not really satisfying.
Not sure if other languages' type systems are powerful enough to express this, but in principle and in Haskell, a prism is a special case of a traversal.
A traversal is a way to “visit” all occurrences of “elements” within some “container”. The classical example is
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
This is typically used like
Prelude> mapM print [1..4]
1
2
3
4
[(),(),(),()]
The focus here is on: sequencing the actions/side-effects, and gathering back the result in a container with the same structure as the one we started with.
What's special about a prism is simply that the containers are restricted to contain either one or zero elements† (whereas a general traversal can go over any number of elements). But the set operator doesn't know about that because it's strictly more general. The nice thing is that you can therefore use this on a lens, or a prism, or on mapM, and always get a sensible behaviour. But it's not the behaviour of “insert exactly once into the structure or else tell me if it failed”.
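To illustrate that uniformity, here is a small GHCi session (my example, using names from lens):

Prelude Control.Lens> over _Just (+1) (Just 3)
Just 4
Prelude Control.Lens> over _Just (+1) Nothing
Nothing
Prelude Control.Lens> over traverse (+1) [1,2,3]
[2,3,4]

The same over works on a prism, a lens, or a full traversal; when the prism doesn't match, the structure simply comes back unchanged rather than signalling failure.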
Not that “insert exactly once or report failure” isn't a sensible operation; it's just not what lens libraries call “setting”. You can do it by explicitly matching and re-building:
set₁ :: Prism' s a -> a -> s -> Maybe s
set₁ p x s = case matching p s of
  Left  _ -> Nothing
  Right _ -> Just (x ^. re p)   -- rebuild the structure from the new value
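For instance, assuming the set₁ above and lens's _Left:

set₁ _Left 5 (Left 1    :: Either Int String)   -- Just (Left 5)
set₁ _Left 5 (Right "x" :: Either Int String)   -- Nothing: the prism didn't match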
†More precisely: a prism separates the cases: a container may either contain one element, and nothing else apart from that, or it may have no element but possibly something unrelated.
There are a couple of bindings to moment.js I'd like to use for rendering time spans in my Halogen UI, which have types something like:
diffMins :: forall eff. Moment -> Moment -> Eff (now :: NOW | eff) Number
I want to use this function in my UI like this:
H.span_ [H.text $ diffMins (fromEpoch_ 0) (fromEpoch_ myTimeStamp)]
But this is in Eff, so I can't.
What I can do is call into moment with this function:
js:
exports.duration_ = function (millis) {
return moment.duration(millis).humanize();
};
ps:
foreign import duration_ :: Number -> String
humanizeMilliseconds :: Milliseconds -> String
humanizeMilliseconds (Milliseconds n) = duration_ n
My question (or several) then:
Is it "cheating" to call into javascript without saying it's an Eff. If not when is it considered ok and when not? I could squit either way and see these functions as side effecting or not.
If I couldn't have changed the way I'm calling moment (or if doing so is indeed a bad idea), is there a way to do this in the HTML?
It is indeed not possible to perform anything effectful during renders in Halogen, as HTML is only data and render is state -> HTML.
As Phil says in the comment, you don't have to use Eff in the signature of FFI functions, though, if you're sure they perform no effects. In this case it's probably safe, since it's basically arithmetic on dates, but there may be some locale-specific stuff going on. If so, it's only a little bit dodgy, as at least it will always give the same result on the same machine, unless the OS clock is messed with. I'd be a little hesitant to accept that as being effect-free, but if it were really a problem and I needed to do it, I'd at least ensure the function is not exported, so it can't be used anywhere else except in this exceptional circumstance.
You could just do this somewhere in the component's eval, though, and store the value in the component state. myTimeStamp must already be in there, so you could compute this value at the same time. That way you're also not recomputing a static value on every render.
I thought I'd try writing a fast parser using FParsec and quickly realised that many returning a list is a serious performance problem. Then I discovered an alternative in the docs that uses a ResizeArray:
let manyA2 p1 p =
    Inline.Many(firstElementParser = p1,
                elementParser = p,
                stateFromFirstElement = (fun x0 ->
                                             let ra = ResizeArray<_>()
                                             ra.Add(x0)
                                             ra),
                foldState = (fun ra x -> ra.Add(x); ra),
                resultFromState = (fun ra -> ra.ToArray()),
                resultForEmptySequence = (fun () -> [||]))

let manyA p = manyA2 p p
Using this in my code instead makes it run several times faster. So why does FParsec use lists by default instead of ResizeArray?
Using the built-in F# list type as the result type for the sequence combinators makes the combinators more convenient to use in F# and arguably leads to more idiomatic client-side code. Since most F# developers value simplicity and elegance over performance (at least in my experience) using lists as the default seemed like the right choice when I designed the API. At the same time I tried to make it easy for users to define their own specialized sequence combinators.
Currently the sequence combinators that return a list also use a list internally for building up the sequence. This is suboptimal for sequences with more than about 2 elements, as the list has to be reversed before it is returned. However, I'm not sure whether changing the implementation would be worth the effort, since if your parser is performance-sensitive and you're parsing long sequences, you're better off not using lists at all.
I probably should add a section about using arrays instead of lists to the user's guide chapter on performance.
I am a former Java developer and I have recently watched the insightful and entertaining introduction to Scala for Java developers by professor Venkat Subramaniam (https://www.youtube.com/watch?v=LH75sJAR0hc).
A major point introduced is the elimination of declared types in favour of "type inference". Presumably, this means the compiler recognizes, from the context, the type I intend to use.
Being an application security expert by trade, the first thing I tried to do is break this type inference... Example:
// declare a function that returns the square of an input Int. The return type is to be inferred.
scala> val square = (x:Int) => x*x
square: Int => Int = <function1>
// I can see the compiler inferred an Int for the output value, which I do not agree with.
scala> square(2147483647)
res1: Int = 1
// integer overflow
My question is: why did the compiler not see that "*" is an operator with a risk of overflow, and wrap the inputs in something a little more protective, like a BigInteger?
According to the professor, I am supposed to forget about the internal implementation and just get on with my business logic. But after my quick demonstration I'm not so sure that Scala is safe for a programmer who doesn't understand what the compiler is doing with their methods.
I think #rightføld somewhat overstates how often overflows do or don't happen (particularly when considering an attacker who is actively trying to overflow you). But I agree with his basic point. Converting all math to BigInteger would almost certainly have created a massive performance impact over Java. For developers to choose such a language, they'd have to get something visible for that cost.
String objects have a much smaller performance overhead over cstrings for many operations. They also provide very visible benefits to the developer, which is why people use them, not security per se. There are many common things that string objects make easy to do over cstrings. BigInteger provides none of that. It requires exactly the same code at a fraction of the speed, but just won't overflow (a bug few developers see day to day, even if security guys see it more often).
The equivalent would have been a cstring (with strcmp, strcpy, strcat, etc.) that ran at a fraction of the speed, but just didn't require a null terminator. I don't think many people would have jumped to use that, either, no matter how much that would help security over null-terminated strings. And if the language required it, I don't see a lot of people anxious to use the language.
And as #rightføld suggests in the comments, interoperability with Java would be trashed, since most if not all numbers would wind up being BigInteger. You'd constantly be converting, which raises the same dangers of overflows while adding a lot of code complexity (and more performance impacts).
A from-scratch language might get away with ubiquitous BigInteger (like python) if the language had a lot of other compelling features, but it's a very hard thing to retrofit into a language that wants to be a natural transition from (and with) Java.
In addition to the above answers, I think this question misunderstands the purpose of type inference in a statically typed language. Type inference does not make the choices that you are referring to, such as promoting an Int to a BigInt. It is restricted to simply "inferring" the type of an expression based on the known types of its subexpressions at compile time.
The * method on Int returns an Int when supplied with an Int parameter:
def *(x: Int): Int
In this case, since x is declared to be an Int, then x*x must be an Int based on the signature of *.
If we really wanted this behavior, we could define a function that promotes Int to BigInt when multiplying.
implicit class SafeInt(x: Int) {
def safeMult(a: Int): scala.math.BigInt = scala.math.BigInt(x)*a
}
Then we can define a square with the desired property:
scala> val square = (x: Int) => x safeMult x
square: Int => scala.math.BigInt = <function1>
The compiler infers based on the methods available. Int has a method *(Int): Int that is, as far as the compiler knows, perfectly well defined; 2147483647*2147483647 is a perfectly good method call with the result 1, and it doesn't throw ClassCastException or anything like that.
Why is the Int type written this way? Largely for Java/JVM compatibility; many parts of Scala have design compromises for the sake of Java compatibility. If you don't need that functionality, you might prefer to use Haskell or a similar language. (I suspect that even without the requirement for JVM compatibility, Scala would have wanted to expose the machine-native integer types so that users could make that performance/correctness tradeoff where desired. They might not have been the default though)
If you're doing numeric computation in Scala you probably want to use the Spire library, which makes it easy to abstract over numeric types, and provides several high-performance numeric types with particular properties. In particular it has a SafeLong type that handles arbitrary-precision integers but with much better performance than BigInt for values which fall within the Long range, similar to Python's integer type.
Because overflow occurs almost never in practice, and BigInteger is slow as a dog compared to Int. It is also most inconvenient to have all * operations on Ints return BigIntegers.
"Recognizes the type I intend to use" is not an accurate description of what scala tries to do. It infers the most generic type possible given the constraints imposed by the context. Hence if you write List(Nil, "1"), you'll get List[Serializable], because Serializable is an interface that List and String share - disregarding that Serializable was probably not on your mind at all.
The question you're asking could be asked more precisely as "why is Int the type of numeric literals instead of BigInteger?" - inference doesn't have much to do with it.
And we can opine all we want on that topic, but there's one most accurate answer describing why Scala is what it is: "because Java".
If you want the kind of safety you seem to be after, one approach is to define the operation via a partial function which guards against numeric overflow and returns either an Option[Int] or perhaps even an Either[Int, BigInteger].
The type inference for your square function is correct: given that it's inferred from the input types you've specified and the type of the * function, it's not really broken, in my opinion.