-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resource busy error with new async runtime #903
Comments
Hi! Is the source code for you site public? It's be nice to get a reproducible example of some sort. |
Hi Minoru, i can't show the full source, because i have some closed-source code in it, but i was able to reduce it to a minimal stack project: https://github.com/malteneuss/hakyll-async-error -- Main.hs
import Hakyll
import Data.Aeson
main :: IO ()
main = do
hakyll $ do
match "content/siteConfig.yaml" $ compile (getResourceString :: Compiler (Item String))
match "content/post/**" $ do
route idRoute
compile $ do
siteConfig <- load "content/siteConfig.yaml" :: Compiler (Item String)
siteConfigJson <- getMetadata (itemIdentifier siteConfig)
-- decode Json and process...
makeItem ("" ::String) and ---
title: Some title
--- The post markdown files are empty for this example, and the number of threads are ghc-options: -threaded -rtsopts -with-rtsopts=-N4 The error occurs when running |
Excellent! Thank you very much, this should be enough for me to reproduce this. I should probably have time for this in a few days. |
Thanks for an easily reproducible example! I just copied contents/post/1.md a few more times and ran the following loop:
This usually fails on the second iteration :) So, hakyll/lib/Hakyll/Core/Store.hs Lines 130 to 134 in f64d6df
The problem here is the We could protect all I/O in Any ideas? /cc @LaurentRDC |
I'll think about it, but knee-jerk reaction: how do we feel about keeping the store in-memory and only write once at the end of the run? |
Unfortunately, the workaround from OP does not work for me. I also tried passing the flag via CLI flags ( For now, I am disabling |
I ran @Minoru 's script but cannot reproduce the bug on Windows. I tried:
I looped for several minutes without issues. Edit: Oops, needed to create multiple posts. Can reproduce now |
@LaurentRDC Not sure how I feel about that option. The cache only stores metadata and snapshots; I wonder how large this can get? @gwern, I know you have a large Hakyll site — can you please tell us how large is your Also, the cache could be disabled. I searched GitHub for "inMemoryCache language:haskell", found 20 Hakyll sites, one of them disabled the cache, another one did it conditionally, the other 18 enabled the cache. The current default is "enabled". So it's viable to drop the setting, but I'd prefer to avoid it.
Did you also make sure to create additional posts in contents/posts/? How many? I'm on Linux by the way. I'd be surprised if this bug is platform-dependent! @jtojnar Looking at the code you've shared on IRC yesterday (https://github.com/jtojnar/hakyll/blob/72982e1a64f1b5a5a3da9621aad919425124d3ec/data/example/site.hs#L78-L84), I don't yet see why |
I can also reproduce @jtojnar 's issue, even without I'm thinking about whether this is a problem with lazy IO. the |
Because the compile time is about an hour now and the symlinking, it's a bit difficult to say what size it is. (Should that include all the symlinked static files/documents?) I can say the fully built and remotely synced directory is 53GB, inclusive of all files. I'm currently on Ubuntu Linux, hakyll-4.13.4.1, so I would not have observed any new issues in 4.15.x. |
I've tried to enforce strictness as much as possible using BangPatterns, with no luck. I'm wondering if this due to the very small posts that are being processed. What is the timescale for file locks to lock/unlock? This reminds me of #783 , where tests were running so fast that the filesystem couldn't keep up. The solution then was to add a retry mechanism. With this in mind, I've added a retry to from: -- Hakyll\Core\Store.hs
withStore :: Store -> String -> (String -> FilePath -> IO a) -> [String] -> IO a
withStore store loc run identifier = modifyIOError handle $ run key path
where
key = hash identifier
path = storeDirectory store </> key
handle e = e `ioeSetFileName` (path ++ " for " ++ intercalate "/" identifier)
`ioeSetLocation` ("Store." ++ loc)
to -- Hakyll\Core\Store.hs
import Hakyll.Core.Util.File (retryWithDelay)
withStore :: Store -> String -> (String -> FilePath -> IO a) -> [String] -> IO a
withStore store loc run identifier = modifyIOError handle $ retryWithDelay 5 $ run key path
where
key = hash identifier
path = storeDirectory store </> key
handle e = e `ioeSetFileName` (path ++ " for " ++ intercalate "/" identifier)
`ioeSetLocation` ("Store." ++ loc)
which adds a delay of 0.1ms when an exception is thrown, and retries at most 5times. This isn't great, but I don't understand what could cause this problem otherwise |
My actual posts files are all 10-20 KB so quite small, yes.
Btw, i added a second post to the error project (that somehow was not commited before) |
@gwern I think 53G is for _site, but what I'm interested in is _cache (or whatever name you specified via
I didn't look at the code, but I expect that it's just a mutex in RTS. If that's the case, it would take from hundreds of nanoseconds to small digits of microseconds. I'm pretty sure this would be dwarfed by opening and closing the file (even if we wrote nothing to it), because that would involve a switch into the kernel.
All these retries are wasteful, so I tried a slightly different approach: diff --git a/lib/Hakyll/Core/Store.hs b/lib/Hakyll/Core/Store.hs
index da16c6f..0c07e36 100644
--- a/lib/Hakyll/Core/Store.hs
+++ b/lib/Hakyll/Core/Store.hs
@@ -17,17 +17,19 @@ module Hakyll.Core.Store
--------------------------------------------------------------------------------
import qualified Data.Hashable as DH
-import Data.Binary (Binary, decode, encodeFile)
+import Data.Binary (Binary, decode, encode)
import qualified Data.ByteString.Lazy as BL
import qualified Data.Cache.LRU.IO as Lru
import Data.List (intercalate)
import Data.Maybe (isJust)
import Data.Typeable (TypeRep, Typeable, cast, typeOf)
+import Control.Monad (when)
import System.Directory (createDirectoryIfMissing, doesFileExist, removeFile)
import System.FilePath ((</>))
import System.IO (IOMode (..), hClose, openFile)
import System.IO.Error (catchIOError, ioeSetFileName,
- ioeSetLocation, modifyIOError)
+ ioeSetLocation, modifyIOError,
+ tryIOError, isAlreadyInUseError)
--------------------------------------------------------------------------------
@@ -127,33 +129,48 @@ cacheDelete (Store _ (Just lru)) key = do
--------------------------------------------------------------------------------
-- | Store an item
-set :: (Binary a, Typeable a) => Store -> [String] -> a -> IO ()
-set store identifier value = withStore store "set" (\key path -> do
- encodeFile path value
- cacheInsert store key value
- ) identifier
+set :: forall a. (Binary a, Typeable a) => Store -> [String] -> a -> IO ()
+set store identifier value =
+ withStore store "set" (\key path -> do
+ let encoded_value = encode value
+ e1 <- tryIOError (trySet key path encoded_value)
+ case e1 of
+ Left e2 | isAlreadyInUseError e2 -> do
+ cached :: Result a <- cacheLookup store key
+ let encoded_cached = fmap encode $ toMaybe cached
+ when (encoded_cached /= Just encoded_value) $
+ trySet key path encoded_value
+ Left e2 -> ioError e2
+ Right result -> return result
+ ) identifier
+ where
+ trySet key path encoded_value = do
+ cacheInsert store key value
+ BL.writeFile path encoded_value
--------------------------------------------------------------------------------
-- | Load an item
get :: (Binary a, Typeable a) => Store -> [String] -> IO (Result a)
-get store = withStore store "get" $ \key path -> do
- -- First check the in-memory map
- ref <- cacheLookup store key
- case ref of
- -- Not found in the map, try the filesystem
- NotFound -> do
- exists <- doesFileExist path
- if not exists
- -- Not found in the filesystem either
- then return NotFound
- -- Found in the filesystem
- else do
- v <- decodeClose path
- cacheInsert store key v
- return $ Found v
- -- Found in the in-memory map (or wrong type), just return
- s -> return s
+get store idenfifier =
+ retryIfAlreadyInUse $ withStore store "get" (\key path -> do
+ -- First check the in-memory map
+ ref <- cacheLookup store key
+ case ref of
+ -- Not found in the map, try the filesystem
+ NotFound -> do
+ exists <- doesFileExist path
+ if not exists
+ -- Not found in the filesystem either
+ then return NotFound
+ -- Found in the filesystem
+ else do
+ v <- decodeClose path
+ cacheInsert store key v
+ return $ Found v
+ -- Found in the in-memory map (or wrong type), just return
+ s -> return s
+ ) idenfifier
where
-- 'decodeFile' from Data.Binary which closes the file ASAP
decodeClose path = do
@@ -162,6 +179,14 @@ get store = withStore store "get" $ \key path -> do
BL.length lbs `seq` hClose h
return $ decode lbs
+ retryIfAlreadyInUse :: IO a -> IO a
+ retryIfAlreadyInUse action = do
+ e1 <- tryIOError action
+ case e1 of
+ Left e2 | isAlreadyInUseError e2 -> action
+ Left e2 -> ioError e2
+ Right result -> return result
+
--------------------------------------------------------------------------------
-- | Strict function Note that I put the new entry into LRU before I write it to disk. Anyone who gets a "resource busy" error retries starting with LRU. If they intended to write the same value as is already in LRU, then they consider the job done; otherwise they try to write one more time, and fail then. This fixes the code from OP, but would fail if there were two cache entries that cause conflicts. That's the same problem as in your solution; we have to allow infinite number of retries. I also tried @jtojnar's code, and I do get errors, but never with
and
I haven't dug into them yet, but the core problem is the same: I'll try rwlock sometime this week, and report back. |
601MB. |
I has this bug again. Using hakyll 4.15.1.1, I get
every time I change the content of This bug occurs when I want to produce two version of a file, one page and one raw, as in the tutorial. This is part of my match "posts/*" $ version "raw" $ do
route idRoute
compile getResourceBody
match "posts/*.md" $ do
route $ setExtension "html"
compile $ pandocCompiler'
>>= loadAndApplyTemplate "templates/post.html" postCtx
>>= loadAndApplyTemplate "templates/default.html" postCtx
-- >>= relativizeUrls (where |
This is fixed in 4.16.0.0. |
After upgrading to the newest Hakyll version 4.15.1.0 with async runtime, i get the following error:
This file in question is a yaml file containing site-wide data such as the webpage title, and is loaded from every page. There exists no
siteConfig.yaml.metadata
file.ps: I can work around this issue by falling back to a single thread with the usual cabal settings:
executables site ... ghc-options: -threaded -rtsopts -with-rtsopts=-N1
The text was updated successfully, but these errors were encountered: