{-# LANGUAGE FlexibleContexts, InstanceSigs, ScopedTypeVariables,
             TemplateHaskell, GeneralizedNewtypeDeriving #-}

{- | Fifth assignment for IB016, semester spring 2019.

== Effective Transformation of a Logical Formula to Conjunctive Normal Form

In this homework you work will be to implement an effective tranformation of a
propositional logic formula to Conjunctive Normal Form (CNF). You probably
know from the basic logic course that for every propositional formula there is
an equivalent formula in CNF. The problem with the basic transformation is that
there can be an exponential blowup, i.e. the size of the resulting formula can
be exponential compared to the size of the original formula. This is very
impractical for example for SAT solvers (which are algorithms which aim at
checking if a formula is satisfiable and try to be efficient for large classes
of propositional formulas): these solvers usually require input formula to be
in CNF and their runtime can be up to exponential in the size of the input
formula. Therefore a SAT solver with naive conversion to CNF would have double
exponential blowup.

There is another approach to conversion to CNF: we can create a formula in CNF
which is satisfiable if and only if original formula was satisfiable (i.e. is
/equisatisfiable/) and which has only a linear overhead. The trick is to use
auxiliary variables in the transformation to represent subformulas of the
original formula. This transformation is called Tseytin transformation and is
described for example on Wikipedia:
<https://en.wikipedia.org/wiki/Tseytin_transformation>. We do not require that
you use Tseytin transformation in particular, however, your transformation must
guarantee that the size of the resulting formula is linear with respect to the
size of the original formula.

Your goal will be to implement functions 'eval' and 'getVariables' for boolean
formulas represented by a tree data type, as well as those represented by 'CNF'
(which is basically a list of lists of literals). Furthermore, you will be
required to implement 'toCNF' which converts a tree-represented formula to CNF.
Finally there is a small helper function 'eval'' which you should also
implement. Your implementation should use monad transformers, see later.
There are some tests prepared for you, feel free to write your own tests on top
of these.

=== Monad Transformers

In implementing the required functionality you are strongly advised to use
monad transformers.

* 'eval': this function operates within 'MonadReader' and 'MonadError' and
  therefore you must use their functionality in the implementation.
* 'getVariables' in the version for 'FormulaTree': think about a good use of
  'MonadWriter' / 'Writer' in this function. You are strongly advised to use it
  (and it __will be required to get full points__).
* 'toCNF': think about which monads\/monad transformers will be useful for this
  task. There are several aspects of the conversion to CNF which map well to
  different monads\/transformers shown in the class. You will still get some
  points if you don't use any of them, but your work will probably be harder
  then if you choose your monads well.

=== Tips & Tricks

* Think before implementing. Especially about monads and transformers shown on
  the 10th & 11th lecture.
* Have a look at the <https://hackage.haskell.org/package/mtl/ mtl> package.
* Use ghcid (or an IDE with similar functionality) for type checking & testing:
  @ghcid -c 'ghci -Wall' -T testAll HW05.hs@ -- this will re-compile your
  solution every time you change it, and show you either compilation
  errors/warnings or test results. Ghcid can be installed either from your
  distribution's package if you have it, or from cabal.
* Use
 <https://downloads.haskell.org/~ghc/7.8.1/docs/html/users_guide/typed-holes.html typed holes>
 when you don't know the type of an subexpression which you need to write.
* Don't forget to use @hlint@ & check compilation on aisa/ghc 8.6.

=== Notes on Used Language Extensions

There are several language extensions enabled in this homework. If you want to
enable some more, please ask in discussion forum.

* @FlexibleContexts@ -- this allows usage of concrete types (with kind @*@) as
  parts of the context (e.g. in @'MonadReader' 'Valuation' m@ 'Valuation'
  cannot be used without this extension).
* @InstanceSigs@ -- allows you to specify explicitly a type of a function in
  an instance definition.
* @ScopedTypeVariables@ -- this, together with @ExplicitForAll@ which it
  implies, allows you to use type variables from the functions' type in its
  nested functions, e.g.:

    @
    foo :: forall m a. Monad m => (a -> m a) -> m [a]
    foo = ...
      where
        bar :: a -> m a
        bar = ...
    @

    Here @m@ and @a@ in definition of @bar@ represent the same type as in @foo@
    (while normally, this would define a polymorphic @bar@ and therefore @m@
    would not need to be a monad).

* @TemplateHaskell@ is used for generation of 'testAll' function.
* @GeneralizedNewtypeDeriving@ allows us to lift any instance available on some
  type to @newtype@s over this type (see e.g. 'CNF', 'Clause').

=== Modules and packages

You can use any module from packages
<https://hackage.haskell.org/package/base/ base>,
<https://hackage.haskell.org/package/containers/ containers>,
and <https://hackage.haskell.org/package/mtl/ mtl> (which includes
<https://hackage.haskell.org/package/transformers/ transformers>).
If you wish so, you can also use Unicode syntax from
<https://hackage.haskell.org/package/unicode-prelude unicode-prelude>.

-}

module HW05
    ( Formula (..)
    , Valuation, insert
    , Varname
    , eval'
    , Pretty (..)
    , FormulaTree (..)
    , Op (..)
    , Literal (..), Clause (..), CNF (..)
    , toCNF
    -- * Tests
    , testAll
    -- * Misc helpers
    , anyM
    , foldMapM
    ) where

import Control.Monad.Writer
import Control.Monad.Reader
import Control.Monad.Except
import Data.Functor ( (<&>) )

import Data.List ( intercalate )
import Data.Map ( Map )
import Data.Set ( Set )
import qualified Data.Map as M
import qualified Data.Set as S

import Test.QuickCheck

type Varname = String

-- | Class for formulas which can have different representation.
class Formula f where
    -- | A formula can be evaluated in a monad which allows reading of
    -- valuations and reporting of string errors.
    eval :: (MonadError String m, MonadReader Valuation m) => f -> m Bool

    -- | Get all free variables used in the formula.
    getVariables :: f -> Set Varname

-- | Boolean binary operations used in formulas.
-- Note that the 'Enum' instance is useful for QuickCheck.
data Op = And | Or | Implies | Equiv
        deriving ( Eq, Show, Read, Enum )

-- | A tree representation of formula.
-- @Let@ can be used to name a subtree and re-use this name multiple times.
-- We say that variable @v@ is free in the formula if it is used in it, and
-- there is at least one use which is not inside a scope of a let
-- subexpression. E.g. in formula @(a && let x = b || c in x && !b)@ variables
-- @a@, @b@, @c@ are free, while @x@ is not free. Note that the scope of
-- a @let@-bound variable is only in the @in@ part of @let@. E.g. in the
-- formula @(let x = x in x)@ is equivalent to @(let y = x in y)@, i.e. there
-- are two distinct variables named @x@.
data FormulaTree = Var Varname
                 | Not FormulaTree
                 | BinOp Op FormulaTree FormulaTree
                 | Let Varname FormulaTree FormulaTree
                 deriving ( Eq, Show, Read )

-- | A literal is either a (positive) variable, or negation of variable.
data Literal = Pos Varname | Neg Varname deriving ( Eq, Show, Read )

-- | A Clause is a disjunction of literals.
-- Note that we can derive monoid for newtypes over monoids thanks to
-- the @GeneralizedNewtypeDeriving@ extension (it allows us to "lift" any
-- instance from a base type to a newtype over the given type).
newtype Clause = Clause { getLiterals :: [Literal] }
               deriving ( Eq, Show, Read, Semigroup, Monoid )

-- | A CNF formula is a conjunction of Clauses.
newtype CNF = CNF { getClauses :: [Clause] }
              deriving ( Eq, Show, Read, Semigroup, Monoid )

-- | A valuation is a (partial) mapping from variable names to boolean values.
newtype Valuation = Valuation { getValuation :: Map Varname Bool }
                    deriving ( Show, Eq, Semigroup, Monoid )

-- | A 'foldMap' lifted into a monad.
foldMapM :: (Monad m, Monoid b) => (a -> m b) -> [a] -> m b
foldMapM f xs = mconcat <$> mapM f xs

-- | Insert into a valuation.
insert :: Varname -> Bool -> Valuation -> Valuation
insert var bv (Valuation mv) = Valuation $ M.insert var bv mv

-- | Evaluation should be strict in the sense that it should produce an error
-- if any free variable in the formula is not defined in the valuation. This
-- should be true even for variables used in definition of unused @let@ bindings:
-- evaluation of formula @(let x = a in b)@ with valuation which assigns value
-- to @b@ but not @a@ should produce an error.
-- 'getVariables' should produce set of all free variables.
--
-- Hint: you can use @Writer@ in 'getVariables'.
instance Formula FormulaTree where
    {- Fill in here -}

instance Formula CNF where
    {- Fill in here -}

-- | Evaluate a formula, returning either error message or result.
-- This function is derived from 'eval''. It is mostly a helper to for
-- interactive evaluation of formulas.
-- Note: pay attention to instances of 'MonadError', find the one best suited
-- for this.
eval' :: Formula f => f -> Valuation -> Either String Bool
eval' = undefined

-- | Convert a formula to CNF (Conjunctive Normal Form) with linear overhead
-- (i.e. the length of the result must have linear size with respect to the
-- length of the input). Use for example https://en.wikipedia.org/wiki/Tseytin_transformation.
--
-- You should use an appropriate (combination of) monads/monad transformers
-- in the solution. You can expect some points for solution without reasonable
-- use of them, but not too much.
--
-- The runtime of the conversion should be in @O(n log n)@ where @n@ is the
-- length of the input formula (hint: think about @let@ bindings).
--
-- You can assume that the original formula does not contain any variables
-- with @aux@ prefix.
toCNF :: FormulaTree -> CNF
toCNF = undefined

-- | Substitution: given a variable name, a formula which should be substituted
-- instead of it, and a formula to substitute in perform substitution. E.g.
-- (pseudo-code) @substitute "x" (a || b) (y && (x || z))@ ~>
-- @(y && (a || b || z))@
substitute :: Varname -> FormulaTree -> FormulaTree -> FormulaTree
substitute v to (Var x) | x == v    = to
                        | otherwise = Var x
substitute v to (Not f)             = Not $ substitute v to f
substitute v to (BinOp op a b)      = BinOp op (substitute v to a) (substitute v to b)
substitute v to (Let x f into)
                        | x == v    = Let x (substitute v to f) into
                        | otherwise = Let x (substitute v to f) (substitute v to into)

-- | Things which have nice human-readable representation.
class Pretty a where
    pprint :: a -> String

instance Pretty Valuation where
    pprint = ('{':) . (++ "}") . intercalate ", " . fmap single . M.toList . getValuation
      where
        single (var, val) = show var ++ " -> " ++ show val
-- | Human-readable representation of binary operators.
instance Pretty Op where
    pprint And     = "&&"
    pprint Or      = "||"
    pprint Implies = "=>"
    pprint Equiv   = "<=>"

-- | A pretty-printer for formulas, the scope of @let-in@ expression is as in
-- Haskell.
-- Associative operations (||, &&) are printed without unnecessary braces, e.g.
-- @(a || b || c) && d && (e || f)@ instead of @((a || (b || c)) && d) && (e || f)@.
instance Pretty FormulaTree where
    pprint :: FormulaTree -> String
    pprint = pprint' (const id)
      where
        brace :: Maybe Op -> String -> String
        brace _ x = "(" ++ x ++ ")"

        braceOther :: Op -> Maybe Op -> String -> String
        braceOther Implies b x              = brace b x
        braceOther Equiv   b x              = brace b x
        braceOther a b x | Just a == b    = x
                         | otherwise      = brace b x

        pprint' :: (Maybe Op -> String -> String) -> FormulaTree -> String
        pprint' _ (Var x) = x
        pprint' _ (Not x) = "!" ++ pprint' brace x
        pprint' br (Let v what into) = br Nothing $
                   "let " ++ v ++ " = " ++ pprint what ++ " in " ++ pprint into
        pprint' br (BinOp op l r) = br (Just op) $
                    pprint' (braceOther op) l ++ " " ++ pprint op ++ " "
                                              ++ pprint' (braceOther op) r

instance Pretty CNF where
    pprint = pprint . fromCNF

-- | Convert a 'CNF' formula to 'FormulaTree' representation of the CNF.
-- For formulas with empty clauses and for empty formulas, this transformation
-- can introduce a new variable.
fromCNF :: CNF -> FormulaTree
fromCNF (CNF [])  = BinOp Or (Var "p") (Not (Var "p"))
fromCNF (CNF cls) = foldl1 (BinOp And) $ concatMap fromClause cls
  where
    fromClause :: Clause -> [FormulaTree]
    fromClause (Clause []) = [Var "p", Not (Var "p")]
    fromClause (Clause lts) = [foldl1 (BinOp Or) $ map fromLit lts]

    fromLit (Neg x) = Not (Var x)
    fromLit (Pos x) = Var x

-------------------------------------------------------------------------------
-- QuickCheck tests -----------------------------------------------------------
-------------------------------------------------------------------------------

-- | Get variable names for use in QuickCheck-generated formulas.
varsForGen :: [Varname]
varsForGen = ("x" ++) . show <$> [0 :: Int .. 5]

instance Arbitrary Op where
    arbitrary = elements [toEnum 0 ..]
    shrink op = init [toEnum 0 .. op]

instance Arbitrary FormulaTree where
    arbitrary = sized genFormula
      where
        genFormula 0 = Var <$> elements varsForGen
        genFormula n = frequency [ (1, genFormula 0)
                                 , (1, Not <$> genFormula (n - 1))
                                 , (3, BinOp <$> arbitrary <*> halfGen <*> halfGen)
                                 , (1, Let <$> elements varsForGen <*> halfGen <*> halfGen)
                                 ]
          where halfGen = genFormula (n `div` 2)

    shrink (Var x)        = Var <$> takeWhile (< x) varsForGen
    shrink (Not x)        = x : fmap Not (shrink x)
    shrink (BinOp op a b) = [a, b] ++ [BinOp o a b | o <- shrink op]
    shrink (Let v f into) = [f, into, substitute v f into]
                            ++ [Let v f' into | f' <- shrink f]
                            ++ [Let v f into' | into' <- shrink into]

-- | Give a random valuation for variables used in the given formula.
arbitraryValuation :: Formula f => f -> Gen Valuation
arbitraryValuation f = foldMapM (\v -> Valuation . M.singleton v <$> arbitrary) vars
  where
    vars = S.toList $ getVariables f

va, vb, vc, vx, vy :: FormulaTree
va = Var "a"
vb = Var "b"
vc = Var "c"
vx = Var "x"
vy = Var "y"

band, bor, bimpl, beq :: FormulaTree -> FormulaTree -> FormulaTree
band = BinOp And
bor = BinOp Or
bimpl = BinOp Implies
beq = BinOp Equiv

-- | Create a unitTest test from a list of singleton properties.
-- The properties should not use any generator as the unitTest test will run only once.
unitTest :: [Property] -> Property
unitTest = once . foldr1 (.&&.)

-- | A simple unitTest test for pprint
prop_pprintFormulaTree :: Property
prop_pprintFormulaTree = unitTest
    [ pprint vx === "x"
    , pprint (Not vx) === "!x"
    , pprint (vx `band` Not vy) === "x && !y"
    , pprint (Let "x" (va `bimpl` vb) (Not vx `band` vc))
      === "let x = a => b in !x && c"
    , pprint (va `band` vb `band` vc) === "a && b && c"
    , pprint ((va `bor` vb `bor` vc) `band` (vx `bor` va) `band` (vy `bor` vc))
      === "(a || b || c) && (x || a) && (y || c)"
    , pprint ((va `bor` vb `bor` vc) `band` (Let "x" va (vx `bor` vx `bor` vy) `band` (vy `bor` vc)))
      === "(a || b || c) && (let x = a in x || x || y) && (y || c)"
    , pprint ((va `bimpl` vb) `bimpl` vc) === "(a => b) => c"
    , pprint (va `bimpl` (vb `bimpl` vc)) === "a => (b => c)"
    , pprint ((va `beq` vb) `beq` vc) === "(a <=> b) <=> c"
    , pprint (va `beq` (vb `beq` vc)) === "a <=> (b <=> c)"
    ]

-- | A unitTest test for 'getVariables' of 'FormulaTree'
prop_getVariablesFormulaTree :: Property
prop_getVariablesFormulaTree = unitTest
    [ getVariables vx === S.fromList ["x"]
    , getVariables (Not vx) === S.fromList ["x"]
    , getVariables (vx `band` vy) === S.fromList ["x", "y"]
    , getVariables (Let "x" (va `band` vb) (vc `bimpl` vx))
      === S.fromList ["a", "b", "c"] -- x is not free variable here
    , getVariables (Let "x" (va `band` vx) (vc `bimpl` vx))
      === S.fromList ["a", "x", "c"] -- x is free variable here (in the let-binding)
    ]

-- | Add valuation information to the counterexample.
ceval :: Testable p => (Valuation -> p) -> Valuation -> Property
ceval prop val = counterexample ("val: " ++ pprint val) (prop val)

-- | A handler which shows any errors thrown in 'ExceptT' as counterexamples.
-- The property is run in error + valuation reader monad, while the result is
-- valuation reader monad.
testHandler :: Testable p => ExceptT String ((->) Valuation) p -> Valuation -> Property
testHandler prop val = case runExceptT prop val of
                          Left err -> counterexample err (property False)
                          Right p -> property p

-- | Logical implication.
imples :: Bool -> Bool -> Bool
imples a b = not a || b

-- | A monadic version of 'any'.
anyM :: Monad m => (a -> m Bool) -> [a] -> m Bool
anyM p xs = or <$> mapM p xs

-- | If a CNF formula evaluates to @True@ with some random valuation, the
-- original formula must evaluate to @True@ with the same valuation.
-- Note that we use @Blind@ to prevent QuickCheck from printing the formula
-- with @Show@ and print it manually with 'pprint' instead.
prop_cnf_implies_orig :: Blind FormulaTree -> Property
prop_cnf_implies_orig (Blind f) =
    counterexample ("f:   " ++ pprint f) $
    counterexample ("cnf: " ++ pprint cnf) $
    forAllBlind (arbitraryValuation (BinOp And (fromCNF cnf) f)) $ ceval . testHandler $ do
        -- note: reader of valuation + error
        direct <- eval f
        using_cnf <- eval cnf
        pure $ using_cnf `imples` direct
  where
    cnf = toCNF f

-- | If the original formula evaluates to @True@ with some random valuation,
-- there must exist valuation of the auxiliary variables used in CNF
-- transformation such that the CNF formula evaluates to @True@ with the
-- combination of these valuations. We just try all the possible valuations of
-- aux variables here (therefore we discard tests which use too many of them).
prop_orig_implies_Ecnf :: Blind FormulaTree -> Property
prop_orig_implies_Ecnf (Blind f) =
    counterexample ("f:   " ++ pprint f) $
    counterexample ("cnf: " ++ pprint cnf) $
    forAllBlind (arbitraryValuation f) $ ceval . testHandler $ do
        -- note: reader of valuation + error
        let auxs = S.toList $ getVariables cnf S.\\ getVariables f
        when (length auxs >= 10) discard
        direct <- eval f
        auxvals <- assignments auxs
        using_cnf <- anyM (\auxval -> local (const auxval) $ eval cnf) auxvals
        pure $ direct `imples` using_cnf

  where
    cnf = toCNF f
    assignments :: MonadReader Valuation m => [Varname] -> m [Valuation]
    assignments []     = ask <&> (:[])
    assignments (x:xs) = assignments xs <&> \rest ->
                         map (insert x True) rest ++ map (insert x False) rest


pure [] -- enable the following template Haskell magic

-- | Run all tests named @prop_*@. Hint: install ghcid (either from your distro,
-- or using cabal) and run @ghcid -c 'ghci -Wall' -T testAll HW05.hs@.
-- It will show compiler errors and warnings if there are any, and test results
-- otherwise, automatically rerunning it if you change this source.
testAll :: IO Bool
testAll = $forAllProperties $ quickCheckWithResult stdArgs { maxSuccess = 1000 }