Class SwitchingOutputStream

  • All Implemented Interfaces:
    Closeable, Flushable, AutoCloseable

    public class SwitchingOutputStream
    extends OutputStream
    An OutputStream that stores its data in memory or in a file, depending on the data size. If the size is known in advance, the storage is fixed when the stream is created. Otherwise the stream starts with memory storage and switches to file storage, keeping the data already written, once the threshold is exceeded.
    After the output stream has been closed, the data can be retrieved as byte[] or Path, depending on the storage used. The stream also tracks the number of bytes written and computes the SHA-512 hash of the data.

    On close and as post-mortem clean-up, this output stream only closes the FileOutputStream, if one is used. The corresponding file is not deleted; deleting it is the responsibility of the creator of this stream.
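
Since the file is left in place, the creator needs its own clean-up. The following is a minimal sketch using only JDK classes, with a plain temporary file standing in for the file this stream would create. The names TempFileCleanupSketch and processAndDelete are hypothetical and not part of this API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the documented API.
final class TempFileCleanupSketch {

    // Writes data to a temporary file, reads its size, and always deletes the file.
    static long processAndDelete(Path directory, String prefix, byte[] data) throws IOException {
        Path file = Files.createTempFile(directory, prefix, ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(file)) {
                out.write(data);
            }
            return Files.size(file);
        } finally {
            // Runs on success and on failure, so no temporary file is left behind.
            Files.deleteIfExists(file);
        }
    }
}
```

The same try/finally shape applies when the Path comes from a closed stream instead of a file created directly.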

    This class is not thread-safe; use each instance from a single thread only.

    This class is inspired by DeferredFileOutputStream.

    See Also:
    StreamTools.getMemStreamThreshold()
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected Cleanup<IOException> cleanup
      The clean-up called as post-mortem action of this SwitchingOutputStream which closes the currentOutputStream if present.
      protected org.apache.commons.lang3.mutable.Mutable<DigestOutputStream> currentOutputStream
      The output stream to which data will be written at any given time.
      protected Path directory
      The directory to use for the temporary file.
      protected MessageDigest md
      The message digest creating the SHA-512 hash for the written data.
      protected String prefix
      The temporary file prefix.
      protected long threshold
      The threshold after which to switch from memory to file data.
    • Constructor Summary

      Constructors 
      Constructor Description
      SwitchingOutputStream​(long size, Path directory, String prefix)
      Creates a new SwitchingOutputStream accepting data and storing it either in memory or in a file.
    • Method Summary

      Modifier and Type Method Description
      protected static int calculateThreshold​(long size)
      Gets the threshold for switching from memory to file storage based on the designated estimated size of the stream.
      protected void checkThreshold​(long count)
      Checks whether writing the designated amount of bytes will exceed the threshold and if so, switches to file storage.
      void close()  
      protected boolean exceedsThreshold​(long count)
      Gets whether writing the designated amount of bytes will exceed the threshold or whether file storage has already been chosen.
      void flush()  
      long getByteCount()
      Gets the number of bytes written to this output stream.
      byte[] getBytes()
      Gets the content of this output stream as byte[] if the threshold has not been exceeded.
      Path getPath()
      Gets the content of this output stream as Path if the threshold has been exceeded.
      byte[] getSHA512Hash()
      Gets the SHA-512 hash of the content written to this output stream or null if the output stream has not been closed yet.
      protected void switchToFileStorage()
      Switches to file storage since the threshold has been reached.
      void write​(byte[] b)
      Writes b.length bytes from the specified byte array to this output stream.
      void write​(byte[] b, int off, int len)
      Writes len bytes from the specified byte array starting at offset off to this output stream.
      void write​(int b)
      Writes the specified byte to this output stream.
    • Field Detail

      • threshold

        protected final long threshold
        The threshold after which to switch from memory to file storage. -1 means permanent memory storage and 0 means permanent file storage; both are used when a size estimate is available, so the storage can be determined a priori.
      • md

        protected final MessageDigest md
        The message digest creating the SHA-512 hash for the written data.
      • directory

        protected final Path directory
        The directory to use for the temporary file.
      • prefix

        protected final String prefix
        The temporary file prefix.
      • currentOutputStream

        protected final org.apache.commons.lang3.mutable.Mutable<DigestOutputStream> currentOutputStream
        The output stream to which data will be written at any given time.
    • Constructor Detail

      • SwitchingOutputStream

        public SwitchingOutputStream​(long size,
                                     Path directory,
                                     String prefix)
                              throws IOException
        Creates a new SwitchingOutputStream accepting data and storing it either in memory or in a file. The file will be in the designated directory and have the designated prefix. Which storage is chosen depends on the written data or on the designated size (if known) and the threshold. If a valid size (>= 0) is provided, the storage may be fixed right from the beginning. Otherwise the storage may change while the data is written.

        The caller must delete the file (if file storage was used) once it is no longer required. This includes clean-up in case of failure.

        Parameters:
        size - The estimated size of (the contents of) the created stream. Use a negative value if the size is unknown.
        directory - The directory to use for the temporary file.
        prefix - The temporary file prefix.
        Throws:
        IOException - If the SHA-512 digest cannot be retrieved, or if file storage is chosen and creating the file or one of its directories fails.
    • Method Detail

      • getBytes

        public byte[] getBytes()
                        throws IOException
        Gets the content of this output stream as byte[] if the threshold has not been exceeded. Otherwise null is returned. This requires the stream to be closed.
        Returns:
        The content of this output stream as byte[] if the threshold has not been exceeded, null otherwise.
        Throws:
        IOException - If this output stream has not been closed yet, an IOException will be thrown.
      • getPath

        public Path getPath()
                     throws IOException
        Gets the content of this output stream as Path if the threshold has been exceeded. Otherwise null is returned. This requires the stream to be closed.
        Returns:
        The content of this output stream as Path if the threshold has been exceeded, null otherwise.
        Throws:
        IOException - If this output stream has not been closed yet, an IOException will be thrown.
      • getByteCount

        public long getByteCount()
        Gets the number of bytes written to this output stream. This does not require the stream to be closed.
        Returns:
        The number of bytes written to this output stream.
      • getSHA512Hash

        public byte[] getSHA512Hash()
        Gets the SHA-512 hash of the content written to this output stream or null if the output stream has not been closed yet.
        Returns:
        The SHA-512 hash of the content written to this output stream or null if the output stream has not been closed yet.
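
The currentOutputStream field is typed as DigestOutputStream, which suggests the hash is accumulated while data is written. The following is a minimal sketch of that JDK technique, assuming an in-memory sink. Sha512WhileWritingSketch and writeAndHash are hypothetical names, and the sketch does not claim to reproduce this class's internals.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the documented API.
final class Sha512WhileWritingSketch {

    // Writes data through a DigestOutputStream and returns the SHA-512 hash of everything written.
    static byte[] writeAndHash(byte[] data) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DigestOutputStream out = new DigestOutputStream(sink, md)) {
            out.write(data, 0, data.length); // updates the digest as a side effect
        }
        // digest() finalizes and resets the hash, so call it only after the last write.
        return md.digest();
    }
}
```

Because digest() finalizes the computation, a hash taken before the last write would not cover the full content.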
      • calculateThreshold

        protected static int calculateThreshold​(long size)
        Gets the threshold for switching from memory to file storage based on the designated estimated size of the stream. A negative size is treated as unknown; in this case the dynamic threshold is the value of StreamTools.getMemStreamThreshold(). A known size is compared against the designated percentage of total memory: if the size is bigger, the threshold is 0, otherwise it is -1. A threshold of 0 uses file storage right from the beginning; -1 uses memory storage and never checks the threshold again.
        Parameters:
        size - The estimated size of (the contents of) the created stream. Use a negative value if the size is unknown.
        Returns:
        The threshold to use for a stream of the designated size: 0 for permanent file storage, -1 for permanent memory storage, or a positive value at which to switch from memory to file storage dynamically.
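
The three outcomes described above can be sketched as a pure function. This is an illustration under assumptions, not the actual implementation: the extra parameters dynamicThreshold (standing in for StreamTools.getMemStreamThreshold()) and memoryLimit (standing in for the memory budget a known size is compared against) are hypothetical, as is the class name.

```java
// Hypothetical sketch, not the library's implementation.
final class ThresholdSketch {

    // size: estimated stream size, negative if unknown.
    // dynamicThreshold: assumed stand-in for StreamTools.getMemStreamThreshold().
    // memoryLimit: assumed stand-in for the memory budget a known size is compared against.
    static long calculateThreshold(long size, long dynamicThreshold, long memoryLimit) {
        if (size < 0) {
            return dynamicThreshold; // unknown size: decide while writing
        }
        return size > memoryLimit ? 0 : -1; // known size: fix the storage up front
    }
}
```

For example, with a dynamic threshold of 1024 and a memory limit of 4096, an unknown size yields 1024, a size of 5000 yields 0, and a size of 100 yields -1.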
      • checkThreshold

        protected void checkThreshold​(long count)
                               throws IOException
        Checks whether writing the designated amount of bytes will exceed the threshold and, if so, switches to file storage. A storage fixed by the threshold in the constructor is respected. If no output stream exists yet, the one matching the threshold and the designated count is created.
        Parameters:
        count - The amount of bytes that are to be written.
        Throws:
        IOException - If this output stream has already been closed, or if creating the storage file or one of its directories fails.
      • exceedsThreshold

        protected boolean exceedsThreshold​(long count)
        Gets whether writing the designated amount of bytes will exceed the threshold or whether file storage has already been chosen.
        Parameters:
        count - The amount of bytes that are to be written.
        Returns:
        Whether writing the designated amount of bytes will exceed the threshold or whether file storage has already been chosen.
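
A possible reading of this check, combined with the special threshold values -1 and 0, is sketched below. It assumes that "exceed" means strictly greater and that the stream keeps a running byte count; the class name and the explicit parameters are hypothetical, since the real method only takes count.

```java
// Hypothetical sketch, not the library's implementation.
final class ExceedsThresholdSketch {

    // threshold: -1 for permanent memory, 0 for permanent file, positive for dynamic switching.
    // written: bytes written so far; count: bytes about to be written.
    // fileStorage: whether file storage has already been chosen.
    static boolean exceedsThreshold(long threshold, long written, long count, boolean fileStorage) {
        if (fileStorage || threshold == 0) {
            return true; // already on file, or fixed to file from the start
        }
        if (threshold < 0) {
            return false; // fixed to memory, never checked again
        }
        return written + count > threshold; // dynamic case (assumes strict comparison)
    }
}
```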
      • switchToFileStorage

        protected void switchToFileStorage()
                                    throws IOException
        Switches to file storage since the threshold has been reached. This will create a new file and copy the present data to this file before writing new data.
        Throws:
        IOException - If creating the file or one of its directories fails.
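
The switch described here (create a file, copy the buffered data, then continue writing) can be sketched with JDK classes only. This is a simplified illustration, not the actual implementation: MemoryToFileSketch is a hypothetical class, it supports only a positive threshold, the ".tmp" suffix is an assumption, and it omits the SHA-512 digest, the closed-state checks, and the clean-up handling.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical, simplified sketch; not the library's implementation.
final class MemoryToFileSketch extends OutputStream {
    private final long threshold;
    private final Path directory;
    private final String prefix;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream current = memory;
    private Path file; // null while the data is held in memory
    private long written;

    MemoryToFileSketch(long threshold, Path directory, String prefix) {
        this.threshold = threshold;
        this.directory = directory;
        this.prefix = prefix;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Switch before writing, so the new data goes straight to the file.
        if (file == null && written + len > threshold) {
            switchToFileStorage();
        }
        current.write(b, off, len);
        written += len;
    }

    private void switchToFileStorage() throws IOException {
        Files.createDirectories(directory);
        file = Files.createTempFile(directory, prefix, ".tmp");
        OutputStream fileOut = Files.newOutputStream(file);
        memory.writeTo(fileOut); // keep the data written so far
        memory = null;
        current = fileOut;
    }

    @Override
    public void flush() throws IOException {
        current.flush();
    }

    @Override
    public void close() throws IOException {
        current.close();
    }

    // Unlike the documented getters, these do not require the stream to be closed.
    Path getPath() {
        return file;
    }

    byte[] getBytes() {
        return memory == null ? null : memory.toByteArray();
    }
}
```

Copying the buffer before redirecting writes keeps the file content identical to what a pure file stream would have produced.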
      • write

        public void write​(byte[] b)
                   throws IOException
        Writes b.length bytes from the specified byte array to this output stream.
        Overrides:
        write in class OutputStream
        Parameters:
        b - The array of bytes to be written.
        Throws:
        IOException - if an error occurs.
      • write

        public void write​(byte[] b,
                          int off,
                          int len)
                   throws IOException
        Writes len bytes from the specified byte array starting at offset off to this output stream.
        Overrides:
        write in class OutputStream
        Parameters:
        b - The byte array from which the data will be written.
        off - The start offset in the byte array.
        len - The number of bytes to write.
        Throws:
        IOException - if an error occurs.
      • write

        public void write​(int b)
                   throws IOException
        Writes the specified byte to this output stream.
        Specified by:
        write in class OutputStream
        Parameters:
        b - The byte to be written.
        Throws:
        IOException - if an error occurs.