***** SECONDO NEWS *****

This file is a replacement of the secondo-news mailing list. Please add here
interesting information which should be kept for future SECONDO users. Just add
new messages directly below this text with a new header.


2011-08-17 Preparations for release of Secondo 3.1.1
===============================================================================

This is just a summary of the most important changes. Please read the
ReleaseInfo to learn more details on changes:

* Harmonized naming schema for moving and unit types
* Renamed operators with typical attribute names to avoid problems in queries
* Plugins NearestNeighbor, TBTree, and STPattern have been merged into the
  standard algebra collection.
* Operators have been added to several Algebras
* TypeMappings have been changed in order to foster the use of generic
  TypeMapping tools and new Attribute member functions returning the typename
* Support for geographic coordinates has been integrated or prepared for many
  spatial and spatio-temporal operators.
* Several bug fixes


2010-06-21 Preparations for release of Secondo 3.0
===============================================================================

Several changes have been made in the last 6 months.

* The Flob concept has been totally re-implemented to avoid the nasty errors
  created by the old Flob-Cache.
* For spatial and spatiotemporal datatypes with set semantics, we now
  differentiate between EMPTY and UNDEFINED values.
* Several changes have been made to make the SMI code compatible with
  different versions of 3rd party software, namely BerkeleyDB, thus increasing
  compatibility with different platforms.
* Many bugfixes regarding system stability: Memory leaks have been fixed, some
  operator implementations corrected.
* New support structure "TupleFile": This type can be used by algorithms that
  need to materialize data. Data is stored in flat files rather than in
  temporary relations.
  Also, only data that does not fit into the main memory buffer is
  materialized on hard disk. It should be used as a replacement for the
  "TupleBuffer".
* Changed implementations / New algebra: The ExtRelation-2Algebra provides
  external algorithms for sorting and different join algorithms. Also, sorting
  is now done by a parameterizable multi-stage mergesort with a restricted
  amount of main memory. The new algorithms use TupleFiles instead of the old
  TupleBuffer. Most of the corresponding original algorithms from the
  ExtRelationAlgebra have been replaced by the operators from this new
  algebra.
* New operators in the RTreeAlgebra allow for query-based inspection of the
  tree structures.
* New modules: The BTree2Algebra provides parameterizable BTrees. The
  RTreeViewer allows for visualized online exploration of RTree objects.
* Optimizer: Exception handling was extended so that now most errors can be
  caught and reported to the user. Scripts for executing the BerlinMOD/R
  benchmark from the optimizer have been added to the Optimizer directory.


2009-03-02 Changes in the SpatialAlgebra
===============================================================================

Since the defined flag is already included in StandardAttribute, I have removed
the additional one from class Point. This is important because class Point will
be used frequently and the change reduces its size.

Sorry, but once again you have to restore your databases!

Regards
Markus


2008-11-14 Changes of TemplateClass Rectangle
===============================================================================

Since the defined flag is already included in StandardAttribute, I have removed
the additional one from the implementation.

Once again you have to restore your databases!
Regards
Christian


2008-10-27 Changes of the Tuple's Block Layout
===============================================================================

In order to allow direct access to attributes without unpacking all other
attribute data, the block layout of tuple records has been changed. A special
relation iterator which utilizes this feature will follow soon.

Once again you have to restore your databases!

Regards
Markus


2008-10-20 Changes for class FLOB
===============================================================================

In order to save disk space, the FLOB class has been changed. It now contains
only a pointer to its meta data, which will be restored when loaded from disk.

Sorry, but again you have to restore your databases!

Regards
Markus


2008-08-29 Changes at the DateTime class
===============================================================================

In order to save disk space, the DateTime class has been changed. Type and
defined flag are now coded within a single character.

For this reason, you have to restore all your databases :-(

Regards
Thomas


2008-08-22 Optional Attribute Datatype Serialization
===============================================================================

Now it's possible to implement functions for attribute data types which manage
the storage to a memory block and the reinitialization from a memory block.
This makes it possible to save disk space, since the default block storage
mechanism is not space efficient. There are example implementations for int,
real and string.
Currently, this is work in progress and can be deactivated by undefining the
compile flag USE_SERIALIZATION in makefile.options.

Documentation:
  Attribute.h
  StandardTypes.h

Note: The code changes require you to run make clean, rebuild SECONDO, and
restore your databases :-(

Regards
Markus


2008-08-07 Word changed
===============================================================================

To reduce memory usage, the struct Word was changed to be a variant. Because of
this change, you have to restore all your databases.

Thomas


2008-07-15 Added new members to template class R_Tree
===============================================================================

In order to support Angelika Braese's implementation of the
NearestNeighborAlgebra, class R_Tree was extended with additional private
attributes and public functions.

Therefore YOU NEED TO RESTORE ALL YOUR DATABASES CONTAINING RTREES!

Christian


2007-05-30 lrsArray removed from the line type
===============================================================================

In order to simplify the line type, the lrsArray and further members have been
removed from the Line class. As a result, some functions are no longer
available. To compensate for that, a new type sline with a corresponding class
SimpleLine has been introduced. A simple line represents a simple polyline
(i.e. with at most one component and without any branches). There are some
functions (and operators) for converting between the types provided.

Because of the changes to the line type:
YOU NEED TO RESTORE ALL YOUR DATABASES CONTAINING LINES!

Have fun!
Thomas


2007-05-30 Added Generic Open and Save Functions for Attribute Types
===============================================================================

We added template functions OpenAttribute and SaveAttribute to file
"Attribute.h". You can use these functions in TypeConstructors for attribute
types, so that you no longer need to implement them yourself.
We used this method to provide OPEN and SAVE methods to all MAPPING and some
further types.

Therefore YOU NEED TO RESTORE ALL YOUR DATABASES!

Enjoy!
Thomas & Christian


2007-09-10 Changes in representation of type movingregion and uregion
===============================================================================

In order to establish a proper use of the defined flag within mapping types, I
needed to change the representations of datatypes movingregion and uregion.

You need to restore any database containing objects of these types!

Christian


2007-06-06 First changes for Linux x86_64 systems
===============================================================================

We have managed to install and compile secondo on a Linux 64 bit system. The
following problems and limitations still arise:

- Some tests fail due to floating point precision errors
- Some tests, e.g. those for operator tuplesize, need to provide new platform
  dependent results.
- Some system constants defined in limits.h (INT_MAX and LONG_MAX) may exceed
  numbers representable in nested lists.
- The Jpl directory does not compile

However, all operations work without system crashes!

Needed SDK-Changes:
Berkeley-DB 4.2.52 must be replaced by version 4.3.29 since the older one does
not compile on an x86_64 platform.

Best Regards
Markus


2007-04-24 Changes in the RelationAlgebra
===============================================================================

The base class GenericRelation was revised in order to make it compatible with
class TupleBuffer, e.g. some member functions of class Relation were declared
as virtual functions in class GenericRelation.

Moreover, the TupleBuffer was made available as secondo type trel. You can
create a trel by consuming a stream(tuple(..)) using operator tconsume, e.g.

  > plz feed tconsume;
  --------------------

Currently this type is intended to be used only for temporary results; the
save and open functions are not implemented.
Since the TupleBuffer creates its Berkeley-DB files in a separate directory
without transaction control, you can save many megabytes of log files if you
run queries with a quite big temporary relation as result.

Moreover, the TupleBuffer no longer copies persistent LOBs (LOBs which are
stored on disk). This again saves much processing time since LOBs on disk have
their own lob-file and record-id which are stable during the query. Only LOBs
which are created in the query itself need to be written to the lob-file of
temporary result relations.

Best Regards
Markus


2007-04-19 Revised Organization of Tuples
===============================================================================

Due to inconsistent management of allocated memory and some contradictions in
the concept of fresh and solid tuples, we changed the concept and
implementation of the tuple representation. Now tuples are always handled like
the former "fresh" tuples, and writing them to disk has no effect on their
current memory organization.

The following major changes happened:

(1) The solid state of tuples is removed, thus tuples are stateless now.
(3) The FLOB class has been revised.
(4) The DBArray class has a new function "TrimToSize", which resizes the array
    and the underlying FLOB to hold exactly the number of elements which are
    stored in it.
(5) Some bug fixes in operator implementations concerning reference counting.

Moreover, the optional "Main Memory" relational algebra implementation files
are removed.

If make fails, please try "make clean; make". Afterwards you should rebuild
your databases.

Best Regards
Thomas and Markus


2007-04-03 New dependency
===============================================================================

I have implemented and added a new algebra module, the 'GSLAlgebra'. It uses
functions imported from the GNU Scientific Library (GSL) and therefore depends
on this library.
I have included the GSL into the installsdk script and copied the gsl
sources/binaries into the gnu-folders of the SECONDO_SDK_INSTALLATION_KIT.

If you get problems compiling Secondo, please install GSL 1.8 manually (e.g.
using yast) or by getting the recent SECONDO_SDK_INSTALLATION_KIT and
re-running the 'installsdk' script.

Christian


2007-01-02 R-Trees
===============================================================================

I have implemented bulkloading for RTrees. Also, using the nodes(_) operator,
you can inspect your RTrees now.

As changes were made in the RTree implementation, you may need to restore your
RTree indices if you run into problems.

Christian


2007-05-01 Example Queries
===============================================================================

There are new features for the definition of example queries. Please study the
file Documents/Secondo-Ideas.txt for details.

Regards,
Markus


2006-11-28 Operator specs
===============================================================================

From now on, we have a new mechanism for specifying example queries. In the
future it should guarantee that all example queries of the online help will
run and produce correct results. Thus it is another way of defining little
tests which are suitable for a quick regression test over all algebra modules.

The concept is described in the file Documents/Secondo-Ideas.txt. Basically
you need to provide an ".examples" file in the algebra module's directory (for
an example refer to the StandardAlgebra).

At startup, for each active algebra the ".examples" file will be read in and
the examples are processed by the Secondo-Parser. Errors will be displayed
during startup (currently there are a lot of them). Template files are
generated below bin/tmp.
Kind regards,
Markus


2006-10-05 Changes in building the OptServer
===============================================================================

The JPL library and the Secondo part of the OptServer are divided into
different files now. This enables the use of precompiled jpl libraries. For
details see Jpl/readme.txt.

On windows platforms this change requires a definition of the variable PL_DLL.
Otherwise the OptServer will not compile. It's recommended to define this
variable within the ~/.secondo.rc file.

Best regards,
Thomas


2006-09-13 Correction in the QueryProcessor::Eval function
===============================================================================

I have corrected the query processor's Eval function. The error was that
Request had to be called for "simple" objects in stream operators (operators
that return streams). Now we only call Request when it is really necessary,
i.e. for nodes that cannot be previously evaluated by the query processor,
e.g. streams and functions.

Best regards,
Victor


2006-09-11 Support for MAC OSX
===============================================================================

I have changed a lot of makefiles and cpp-files in order to make the build
process also possible on Mac OSX. It may happen that the system now does not
compile on Windows since I haven't tested it for this platform yet.

In case of trouble please send a mail to
markus.spiekermann(at)fernuni-hagen.de

Regards
Markus


2006-08-22 New System Tables
===============================================================================

I have implemented two new system tables called

  SEC_CACHEINFO
  SEC_FILEINFO

They provide information about Berkeley-DB's internal cache usage. For detailed
information please refer to the file "CacheInfo.h".


2006-05-08 New Sample and Small Relations
===============================================================================

Due to changes for the entropy optimizer, the "_small" relations have a new
structure and need to be recreated (if you want to use the entropy optimizer).
Probably the easiest way to do this is to restore databases and also to
reinitialize the optimizer information ("rm stored*" in the optimizer
directory).

Sample relations also changed a week ago and need to be recreated as well. For
standard databases this is done automatically. For non-standard databases such
as germany, samples and small relations should now be created manually from
the optimizer, calling predicates

  createSamples('Kreis', 100, 50)

and

  createSmall(kreis, 50)

(sorry for the different syntax), respectively.

Regards
Ralf


2006-05-03 Command Times and Counters
===============================================================================

Command Times:
--------------

The output of the query or command times has been changed. Now also the times
for

1) creating the list representation for the result object
2) for committing the transaction (which is reasonable!)
3) for copying the result list (necessary in order to empty the list memory)

are shown. If the output is too noisy, change the setting in SecondoConfig.ini
in order to suppress them.

Note: Some keywords in SecondoConfig.ini were changed. Please update your
configuration file by replacing it with SecondoConfig.example.

Counters:
---------

There are new counters which keep track of bytes read or written to disk. They
are implemented in SmiRecord and the PrefetchingIterator. There are counters
for

1) The number of function calls
2) The number of transferred bytes
3) The transferred data volume measured in pages

Note: The last value is not identical to reading pages from disk. This
information is only present in the Berkeley-DB cache. Currently, we have no
interface to access the Berkeley-DB cache statistics.
Finally, all command times and counter values are stored in the system tables

  SEC_COMMANDS
  SEC_COUNTERS

which are non-persistent relation objects whose values are only kept during
the current session. However, you can store them by

  let sessionCmds = SEC_COMMANDS feed consume;

Regards
Markus


2006-02-24 Automatic Tests
===============================================================================

The TestRunner has new features! Please refer to example.test for its
documentation. The following features are new:

1) The expected result of a query can be specified in a separate file, e.g.

     #yields @resultFile

2) Values of real atoms of the result and the expected result can be compared
   approximately, either by a relative or by a fixed tolerance parameter.

3) File names can be specified including environment variables of the shell,
   e.g.

     #yields @$(HOME)/data/query1.result

Moreover, the notation $(VARIABLE) can be used for SECONDO's restore and save
commands, but then the file name needs to be a text atom, which can easily be
specified by enclosing it in single quotes, e.g.

  restore database germany from '$(VARIABLE)/secondo-data/germany';

Note: On windows the path separator must be a backslash, for example

  restore database germany from 'C:\msys\1.0\home\myname\secondo-data\germany'

There should be comprehensive test cases for every algebra. The automated test
scripts will run all files below "Tests/Testspecs" which end with ".test".
Moreover, you do not need to set up your own data in a test. The databases
defined in "bin/createdb.test" are restored before all other tests. Hence, if
you need a specific database of the secondo-data repository, please add it
there; in the test file you only need to open the database.

Finally, you can run all tests locally by calling

  make runtests

and a single test can be invoked by

  TestRunner -i

*** Please try to create and maintain test files for ***
*** your algebra from the very beginning of its implementation!
***

Best regards
Markus


2006-02-23 Environment Changes
===============================================================================

The changes of 02-20 can have some confusing effects since files which were
previously created by make are now under CVS control. In order to make sure
that everyone has the same configuration, please run the following commands:

  cvs update -dP
  make update-environment;

open a new shell and run

  cvs update -dP
  make

Sorry for the trouble,
Markus


2006-02-20 Changes of building, linking, and starting the applications
===============================================================================

The build procedure has been changed. Now only two applications called
"SecondoBDB" and "SecondoCS" are compiled. They know the options

  -pl:   Start as SecondoPL
  -test: Start as TestRunner
  -srv:  Start as Server (only SecondoBDB)

Hence we only need to link the algebra libraries into one application instead
of many of them. This speeds up linking. For convenience and backward
compatibility there are some shell scripts called

  bin:       SecondoTTYBDB, SecondoTTYCS, TestRunner, TestRunnerCS,
             SecondoMonitor
  Optimizer: SecondoPL, SecondoPLCS

Hope it works for all,
Markus


2006-01-09 A big change on the kernel of the system has been made
===============================================================================

- The algebra levels were removed. Since the descriptive level is implemented
  in the optimizer, it is not needed anymore. The kernel of the Secondo system
  now works only in the executable level. The concepts of Models and Costs
  were also removed from the kernel of the system for the same reasons.

- In the relational algebra, an LRU cache for FLOBs is implemented. The main
  idea behind this cache is to make better use of the memory inside operators.
  Before this modification, the FLOB size was taken into consideration to
  calculate the size of a tuple in memory.
  Now, only the attribute and the small FLOBs are considered, which increases
  the number of tuples that fit in memory considerably. The cache is also
  important because a pointer to the FLOB memory is returned instead of a
  copy, which can reduce the CPU time. The memory utilization is corrected for
  some operators.

- The concept of free tuples is changed. Now, instead of a boolean value
  telling whether a tuple is free or non-free, we have an integer number where
  zero means that the tuple is free for deletion. Whenever it is loaded into
  memory and we do not want to delete it, this number is increased. When it is
  unloaded, the number is decreased. With this change we could avoid all calls
  to the function CloneIfNecessary, which is now removed. In fact, we do not
  clone tuples anymore.

- At a finer granularity, we avoid, as much as possible, cloning attributes
  too. A reference counter is added to the TupleElement class. Every time a
  tuple needs an attribute from another tuple, it just copies the attribute's
  pointer and increases the counter. When it wants to delete the attribute, it
  decreases the counter; the attribute is only deleted when the counter is
  zero.


2005-11-22 Problems with Javagui using Java Version 1.5
===============================================================================

When you are using Javagui with Sun's Java Version 1.5, the snapshot function
of Javagui leads to a hang of the Java Virtual Machine on linux systems. The
error message will be:

  "Couldn't execl robot child process: Permission denied"

To solve this problem, go into the jdk/jre/lib/i386 subfolder of your java
installation and make the file "awt_robot" executable using the command

  chmod ugo+x awt_robot

Depending on the installation of the java sdk, it may be required to do that
with root rights. If you don't have root access, install your private java-sdk
or ask your administrator.
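
The permission fix for awt_robot could be wrapped in a small shell helper. This
is only a sketch: the function name ensure_executable is hypothetical, and the
assumption that JAVA_HOME points at the root of the JDK installation is not
part of the original note.

```shell
#!/bin/sh
# Hypothetical helper: make a file executable for user, group and others
# (the same effect as "chmod ugo+x") unless it already is executable.
ensure_executable() {
    target="$1"
    if [ ! -f "$target" ]; then
        echo "not found: $target" >&2
        return 1
    fi
    if [ ! -x "$target" ]; then
        chmod ugo+x "$target" || return 1
    fi
    return 0
}

# Usage (assumes JAVA_HOME is set to the JDK root; may need root rights):
# ensure_executable "$JAVA_HOME/jre/lib/i386/awt_robot"
```

The helper leaves files that are already executable untouched, so it can be
run repeatedly.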


2005-10-26 Notes for Problems on newer Linux Systems:
===============================================================================

1) Installation problems:
-------------------------

Some new linux distributions (e.g. SuSe 9.2) are equipped with bash version
3.0. This causes problems in the installsdk script. For example, the configure
script of the gcc will break due to compatibility problems of the trap command
when the bash is running in posix mode. At the secondo website you can
download a newer version of installsdk which solves this problem.

2) Environment problems:
------------------------

Using SuSe 9.2 we observed that environment changes made in the file ~/.bashrc
are permanent for all shells. Hence we recommend only defining an alias, e.g.

  alias initsecondo="source .secondorc $HOME/secondo"

Before compiling SECONDO you have to run this new alias command (only a single
time for that shell):

  initsecondo
  cd secondo
  make

This has the advantage that the changes in the environment made by the
.secondorc file are only local to this shell (and subshells) but not to other
shells. This is more secure and prevents you from messing up your system by
changing important variables like PATH or LD_LIBRARY_PATH.

3) Secondo Server Startup Problems
----------------------------------

There seems to be a problem in retrieving the IP address for localhost. In the
file SecondoConfig.ini the value localhost must be replaced by 127.0.0.1.

Best Regards
Markus Spiekermann


2005-09-29
===============================================================================

Dear all,

currently there is a problem (only MS-WINDOWS) with the jpeg library used by
the picture algebra. A file called jpeg62.dll is missing in the
Secondo-SDK/bin directory. If you have this problem, please download
"jpeg6b-3-bin.zip" from the SECONDO website or disable the picture algebra.
Markus


2005-07-22 Makefile Switches
================================================================================

Dear all,

I have introduced some new variables which influence the make process. For
instance, the settings

  SECONDO_ACTIVATE_ALL_ALGEBRAS="true"
  SECONDO_YACC=/usr/bin/bison

will compile and link all algebra modules. The system's parser generator must
be used; otherwise it is not possible to create the Secondo-Parser. On windows
we don't have a newer version of bison yet, hence it is not possible to
activate all algebras there.

It will also only work if there are no empty algebra directories. This can be
avoided by using option -P (prune empty directories). This should always be
used with cvs checkout and update commands.

With the new switch all subdirectories of Algebras (except Management) are
used. Moreover, for every directory a library file "lib/lib.a" may be produced
(except NauticalMap since it does not compile). All these files are linked
together with the applications.

Moreover, the file AlgebraList.i.cfg must not contain entries which have no
algebra directory, since this will result in an "undefined reference" error
when linking all together.

Moreover, the libraries are now grouped by the "-( lib1 ... -libN -)" linker
command, which will automatically resolve circular dependencies among them.

However, this was only for your information; most of you will not need it. The
overnight make run will use it to ensure that all algebras will compile.

Bye
Markus


2005-07-22 CVS
================================================================================

Dear all,

if you need a stable version of SECONDO you can check out

  cvs co -d sec-stable -rLAST_STABLE secondo

The tag LAST_STABLE will be set by the automatic overnight test if everything
compiles and the TestRunner files return no errors.

Something about the update command:
---------------------------------------------------------

In general you should always use the option -d, e.g.

  cvs update -d

otherwise you will not get new directories from the server. Sometimes empty
directories can cause trouble. In this case use

  cvs update -P

which will remove them. Once you have requested a fixed version by specifying
a tag or date (-r or -D), you will see no future updates. All files in your
working copy are marked with a sticky tag. In order to change this behaviour
call

  cvs update -A

which will reset them.

Bye
Markus


2005-07-18 TupleIdentifier Algebra
================================================================================

Folks,

I added a new algebra called TupleIdentifier Algebra implemented by Matthias
Zielke. I also added a new way of creating B-Trees from streams. Now, the
operator createbtree also expects a stream of tuples containing an extra
attribute called tid (from the TupleIdentifier algebra). Two operators are
provided in the TupleIdentifier algebra for adding such attributes, namely
addtupleid and tupleid. One can index a relation now in these ways, for
example:

  let ten_no = ten createbtree[no]   // The old way that is still valid
  let ten_no = ten feed extend[tupleid(.)] createbtree[no]
  let ten_no = ten feed addtupleid createbtree[no]

The motivation behind these changes is that now one can sort a relation before
inserting it into a B-Tree, which is much more efficient. An example for that
is:

  let plz_PLZ = plz feed addtupleid sortby[PLZ asc] createbtree[PLZ]

The changes are available on our CVS server, but if you download the changes
please also activate this algebra in the makefile.algebras and AlgebraList.i
files. The changes were already made in the makefile.algebras.sample and
AlgebraList.i.cfg cvs files.

If someone finds any difficulties or any problems, please let me know.
[]s
Victor


2005-07-15 Extensions in the optimizer's information look up
===============================================================================

Dear all,

there are some extensions in the database dependent information look up, which
are needed by the optimizer. Three files, namely '_sample_j', '_sample_s' and
'_small' are created. An index on relation '_small' with name '__small' will
be created if and only if there is an index available for the pair (,). The
gathered look up information is stored in local memory and is available via
predicates 'storedX', where 'X' is one of 'Spell', 'Card', 'Sel', 'PET',
'Index', 'TupleSize' or 'Rel'.

Please carry out the following steps to keep consistency in your database
dependent information.

1. Delete all files in your database with name _sample_j, _sample_s, _sample
   and _small, if available.
2. Delete or rename all files 'storedXs.pl' (seven files) from your local
   optimizer directory.
3. Make a simple query for each relation in your database, e.g. (sql) select
   count(*) from , using SecondoPL for example.

For every pair (,) there is information available if an index exists or not.
If you add an index '_' to your database, simply type 'updateIndex(,)' to
inform the optimizer that there is an index for the pair (,) available.
Additionally an index of the same type will be created for the relation
_small.

If you delete an index for the pair (,), type 'updateIndex(,)'. The index for
the relation _small will be deleted and the optimizer will be informed that
there is no index available anymore. Note that the updateIndex predicate works
like a switch and doesn't check if the pair (,) is really deleted from the
database. This is the user's responsibility.

If you want to delete a relation from your database, type 'updateRel()'. All
files created above will be deleted and all information about this relation
will be removed from the optimizer's knowledge base.
Regards
Frank


2005-05-20 Sample Files in the Optimizer, New Relational Object in Secondo-Data
================================================================================

Dear all,

I've made some changes in the optimizer module, namely changes in the files
'statistics.pl' and 'database.pl'. For computing a better selectivity,
especially for selection predicates, there are now two different sample files
available. The file '_sample_s' will be used for selection predicates and the
file '_sample_j' for join predicates. For this new feature to work properly,
it is necessary that you delete the following files from your 'Optimizer'
directory: 'storedSels.pl', 'storedSpells.pl', 'storedCards.pl',
'storedRels.pl', 'storedTupleSizes.pl', 'storedIndexes.pl'. Furthermore you can
delete the old sample files _sample from your databases.

There is a new relational object, called 'telefon', available in the CVS
module 'secondo-data'. You will find the file in the directory
'Objects/Telefon97'. 'telefon' contains 31.499.800 tuples with address and
telephone entries from Germany. These data are only for internal use, because
the data isn't freeware. If you want to restore this object in a database, run
'cvs update -d' in your 'secondo-data' directory and follow the instructions
from the README file.

Regards
Frank


2005-05-13 Memory limit for operators - stack trace
================================================================================

Dear all,

I have documented some new (or old but undocumented) configuration options in
SecondoConfig.example. I will mention two important things here:

1) There is a new section QueryProcessor:

  # --- QueryProcessor Section ---
  [QueryProcessor]
  # Max memory in kb available for an operator (e.g. hashjoin or sort)
  #MaxMemPerOperator=4096

If the parameter above is set, the memory available for operators can be
defined. Before, it was hard coded, e.g. hashjoin 16MB and other operators
like product only 2MB.
This seems to be a little bit unfair when one tries to compare two algorithms.
Moreover, we can test with small inputs if the persistent implementation of
the algorithms works. It turned out that the sortmergejoin has a problem,
since it crashes with a segmentation fault in some of my queries when the
memory is less than 4MB. Maybe some branches of code were called which have
never been called before. By default every operator will have 16MB now.

2) Stack Trace

I have improved the output of the stack trace. On a linux system (if compiled
with debugging information [-ggdb]) we will now see complete function and file
names instead of mangled C++ symbols.

  # Uncomment the next line if you don't want
  # to see a stack trace when Secondo crashes.
  # Note: The stack trace is not available on windows!
  RTFlags += DEBUG:DemangleStackTrace

Output Example:

********************************************
**
** Signal #SIGSEGV caught! Printing Stack ...
**
********************************************
??
--> [ ??:0 ]
Application::PrintStacktrace()
--> [ Application.cpp:241 ]
Application::AbortOnSignalHandler(int)
--> [ Application.cpp:317 ]
??
--> [ ??:0 ]
??
--> [ ??:0 ] Tuple::~Tuple() --> [ RelationPersistent.cpp:558 ] Tuple::DeleteIfAllowed() --> [ RelationAlgebra.h:412 ] MergeJoinLocalInfo::ClearBucket(std::vector >&) --> [ ExtRelAlgPersistent.cpp:710 ] MergeJoinLocalInfo::NextResultTuple() --> [ ExtRelAlgPersistent.cpp:940 ] int MergeJoin(Word*, Word&, int, Word&, void*) --> [ ExtRelAlgPersistent.cpp:1068 ] Operator::CallValueMapping(int, Word*, Word&, int, Word&, void*) --> [ Algebra.h:181 ] AlgebraManager::Execute(int, int, Word*, Word&, int, Word&, void*) --> [ AlgebraManager.h:661 ] QueryProcessor::Eval(OpNode*, Word&, int) --> [ QueryProcessor.cpp:2682 ] QueryProcessor::Request(void*, Word&) --> [ QueryProcessor.cpp:2792 ] Head(Word*, Word&, int, Word&, void*) --> [ ExtRelationAlgebra.cpp:1020 ] Operator::CallValueMapping(int, Word*, Word&, int, Word&, void*) --> [ Algebra.h:181 ] AlgebraManager::Execute(int, int, Word*, Word&, int, Word&, void*) --> [ AlgebraManager.h:661 ] QueryProcessor::Eval(OpNode*, Word&, int) --> [ QueryProcessor.cpp:2682 ] QueryProcessor::Request(void*, Word&) --> [ QueryProcessor.cpp:2792 ] TCountStream(Word*, Word&, int, Word&, void*) --> [ RelationAlgebra.cpp:1864 ] Operator::CallValueMapping(int, Word*, Word&, int, Word&, void*) --> [ Algebra.h:181 ] AlgebraManager::Execute(int, int, Word*, Word&, int, Word&, void*) --> [ AlgebraManager.h:661 ] QueryProcessor::Eval(OpNode*, Word&, int) --> [ QueryProcessor.cpp:2682 ] SecondoInterface::Command_Query(AlgebraLevel, unsigned long, unsigned long&, std::string&) --> [ SecondoInterface.cpp:1344 ] SecondoInterface::Secondo(std::string const&, unsigned long, int, bool, bool, unsigned long&, int&, int&, std::string&, std::string const&) --> [ SecondoInterface.cpp:1129 ] SecondoTTY::CallSecondo() --> [ SecondoTTY.cpp:590 ] SecondoTTY::CallSecondo2() --> [ SecondoTTY.cpp:623 ] SecondoTTY::ProcessCommand() --> [ SecondoTTY.cpp:305 ] SecondoTTY::ProcessCommands() --> [ SecondoTTY.cpp:443 ] SecondoTTY::Execute() --> [ SecondoTTY.cpp:921 ] main 
--> [ SecondoTTY.cpp:1068 ] ?? --> [ ??:0 ] _start --> [ start.S:105 ] *********** End Stack ********************** Regards Markus 2005-03-03 Problems with the bison parser generator ================================================================================ Dear all, I finished to merged in the Picture Algebra devolped by students of the practical course in database systems. It is possible to import JPEG, TGA and PCX pictures. Sample pictures and an import command file can be found in the secondo-data CVS repository. A nice viewer is also present. Besides picture a data type histogram is implemented, which represents color distributions in the RGB color scheme. By default this algebra is not active. If you switch it on the parser generator bison has a problem since the table size of 32767 will be exceeded. I found no switches to resize this table, hence I decided to use a newer version of bison which works fine. On linux you can easily switch to a newer version if you edit the makfile.linux and set YACC=/usr/bin/bison On windows you need to download a newer version from gnuwin32.sourceforge.net. Bye Markus 2005-02-09 Secondo-SDK Configuration =============================================================================== Dear all, I have revised the environment setup for SECONDO. The next time when you update and run make the files .secondorc .secondo.linuxrc (or .secondo.win32rc) .secondo.sdkrc will be created in your home directory. The file .bashrc should simply call source $HOME/.secondorc [secondo-root-dir] [secondo-sdk-dir] this should be already done in all installations. By default $HOME/secondo is assumed to be the directory where the SECONDO sources are present. If not so you can pass the directory as optional argument. The second parameter can overrule the default directory SECONDO_SDK=/home/secondo-sdk. This is interesting for global installations like we have on zeppelin. 
Since the automatic detection of the Berkeley-DB version was very slow on
windows-msys, I removed it. If you have already installed a newer Berkeley-DB,
please set up the directory in the file ".secondo.sdkrc". Moreover, you may
have to change the CVSROOT variable; it is also defined there. In most cases
this should be the only file to change.

I hope that the environment configuration now

  (1) is easier to maintain
  (2) is easier to understand
  (3) has more verbose error messages in case of missing directories, etc.

Bye
Markus

P.S.: When you have more than one SECONDO source tree on your computer you can
simply change the environment by calling "setvar" (without parameter $PWD) in
the root of a SECONDO source tree.


2005-02-07 The aggregate operator
===============================================================================

Hi all,

I added a new operator called 'aggregate' to the Extended Relational Algebra.
I am sending you the description of the changes I made in the CVS.

[]s
Victor

----------

In this version four modifications were made:

- The operators unionbbox and intersectionbbox were removed from the
  Rectangle Algebra, because
- The operator aggregate was added in the Extended Relational Algebra, and
- The operators union and intersection were added in the Rectangle Algebra.
- The specification of the translate operator was changed.

With that, general aggregate operations can be done on relations. For example,
unionbbox can now be rewritten as

  query Kreis_box feed aggregate[box; fun(r1: rect, r2: rect) r1 union r2;
                                 [const rect value undef]]

where the first argument, box, is an attribute of type rect in the relation
Kreis_box; the second is the aggregate function, which operates on two
elements; and the third argument is the empty value that is combined with the
first tuple of the relation.
As another example, the sum operation can now be rewritten as:

  query ten feed aggregate[no; fun(i1: int, i2: int) i1+i2; 0]

----------


2005-01-31 Automatic Tests
===============================================================================

Hello all,

I created some new scripts which automatically run all available tests through
the TestRunner. Moreover, I created a new directory Tests/Testspecs and moved
all tests there. You can simply add new test files there; if their names end
with ".test" they will be recognized by the script. You can call the script
with

  make runtests

Please do this before you check in changes to important modules!!!

The test run creates its own Berkeley-DB database directory. The output of the
tests will be stored in Tests/Testspecs/.log

Moreover, every night a cron job retrieves a CVS copy, compiles SECONDO and
runs these tests. Everyone who has committed since the last successful run
will get an email if any error happens.

Currently, the oldrelalg.test fails with a segmentation fault.

Bye
Markus


2005-01-25 Berkeley-DB Release change
===============================================================================

Dear all,

I'm sorry, but my instructions were misleading. The problem is that the
directory for the Berkeley-DB was hard coded into the setvar script and I told
you another name. Hence, even if you have installed a new Berkeley-DB, it will
not be used. The command 'catvar' displays all important directories in use.
If BERKELEY_DB_DIR is not the directory where you have installed it, the old
version will still be used.

I have now fixed the problem, and the only restriction is that the library
must be installed below the directory $SECONDO_SDK.

If you have already installed Berkeley-DB 4.2.52, do:

  1) cvs update
  2) make update-environment
  3) close all shells and open a new one
  4) make clean; make
  5) restore databases

If you haven't installed it yet, the old instructions will work.


2005-01-24 Berkeley-DB Release change
===============================================================================

Dear all,

I changed some SMI code so that you can also use SECONDO with Berkeley DB
version 4.2.52. This version has native mingw support, hence it is easier to
install with gcc on Windows. Since version 4.2.52 is intended to be used for
the CeBit-Version, please follow the instructions below and install
Berkeley-DB 4.2.52 as soon as possible, because there are only 3 weeks left
for testing this version.

Here are the upgrade instructions:

  1) Update Secondo (cvs update)
  2) Run make update-environment
  3) Close all open shells and start a new one
  4) Download Berkeley-DB 4.2.52 (without encryption) from www.sleepycat.com
     or, if you have access, from
     zeppelin:/home/secondo/SECONDO_CD/{linux,windows}/non-gnu
  5) Extract the distribution somewhere and run the following commands
     a) cd db-4.2.52/build_unix
     b) ../dist/configure --enable-mingw --enable-cxx --prefix=$SECONDO_SDK/db4252
        Note: The switch enable-mingw should only be used on Windows
     c) make
     d) make install
  6) Save your current databases if you think this is necessary
  7) Now do a make clean in your secondo directory and call make again
  8) Delete the database directory and restore your databases.

If you have any problems with the procedure above, please contact me.

Best Regards
Markus


2004-12-22 Nested Lists
===============================================================================

Dear all,

since Zhiming had problems with nested lists containing German "Umlaute" such
as ä or ö, I revised the scanner specifications in order to make them more
robust. However, the problem with "Umlaute" seems to affect only the Java
based nested list scanner and is still present. Besides that, some other
improvements to the nested list parsing were made; they are explained below:

Changes in the C++ and Java Scanner:

- Special characters like \t \v \a \b, etc. will be overwritten.
- Moreover, there should be no problem exchanging nested list files between
  Linux and Windows.

Changes in C++:

- The error message was improved. The position of the character causing an
  error and the last token name will be displayed.
- The scanner and parser can be switched into a debug mode that displays a lot
  of useful information. This can be done by a new command called "set". For
  example

    (set "NLParser:Debug" = TRUE);

  will turn on the debug mode for the parser. Note: The command is only
  recognized in nested list syntax.

Moreover, the command can be used to change the RTFlags defined in the
SecondoConfig.ini at runtime. Currently only boolean values are supported.
However, some of them are not really runtime parameters since they are only
used at startup of the system, hence changing them later is meaningless.

The flag "NLScanner:Debug" controls the scanner's output. If both flags are
switched on, the output is very noisy and not easy to understand since
scanning and parsing are interleaved.

Bye
Markus

===============================================================================

There are some more messages, but I think some of them are out of date now.
Below you will find those which I thought might be most interesting.


2004-11-10 Counter and Runtime Information
===============================================================================

Hello,

if the two RTFlags

  SI:CommandTime
  SI:PrintCounters

are set in SecondoConfig.ini, the files

  cmd-times.csv
  cmd-counters.csv

will be created. This is useful for importing data into spreadsheet programs,
e.g. MS-Excel. Before you import the cmd-counters.csv file it may be necessary
to edit it, since the number of counters depends on the executed code and
hence on the commands you type in.
Look at the examples below:

cmd-times.csv:
------------------------
#nr|command|realtime|cpu-time
1|list databases|0|0.01
2|create database testqueries|2|0.04
3|restore database testqueries from testqueries|3|0.1

cmd-counters.csv:
---------------------------
1
SmiFile::Close|SmiFile::Open
2|9|8
SmiFile::Close|SmiFile::Create|SmiFile::Open|SmiFile:Realloc-DBHandles|SmiRecord::Write
3|27|18|17|16|422
4|9|0|0|0|0

The first column contains the number of the command and the following columns
contain the values of the counters. If the number of counters changes, a new
headline is printed. List databases produces no counter output, hence there is
no headline for command 1.

Information about using counters in your code can be found in the file
include/Counter.h

Bye
Markus


2004-07-30 B-Tree indexing of complex types
===============================================================================

Hi all,

I have just finished a new approach for indexing complex data types using
B-Trees. The idea is to use a string representation of the objects that
preserves the ordering. For that, a new abstract class was created, namely
IndexableStandardAttribute, which has three functions: one for writing the
value into a string (char *), one for reading the value from a string, and
finally one for returning the size of the string representation of the object
in bytes. The data type class must implement this abstract class in order to
be indexable by B-Trees. For the Secondo type checking, I have created a kind
called INDEXABLE to which the data types must belong. As an example, see the
DateTime algebra.

[]'s
Victor


2004-07-21 New Operators available
===============================================================================

Dear All,

I've implemented 3 new operators, "units", "theyear", and "themonth", in
Secondo. The "units" operator didn't work at first because of an error in the
definition of the UPoint class. Markus has now fixed it and it works fine.
The signatures of these operators are as follows:

------units------ (to transform moving data into a stream of moving units)
  mpoint -> stream(upoint)
  mint -> stream(constint)
  mreal -> stream(ureal)

-----theyear----- (to get a periods value from the indicated year)
  int -> periods

----themonth--- (to get the periods value from the indicated month)
  int x int -> periods

Example queries are as follows:

  query U15 feed extendstream[ MUnit: units(.zug) ] consume;
  query theyear(2000)
  query themonth(2000, 3)

Best Regards,
Zhiming


2004-07-12 Restoring large objects
===============================================================================

Dear all,

some time ago, I explained how to configure make to create a version of
Secondo which uses a persistent implementation of the Nested-List module
capable of restoring large objects. The changes made there address two things:

  (1) The variable NL_PERSISTENT in the makefile.env, which switches between
      the persistent and the in-memory representation.
  (2) This variable was also used to define a Berkeley-DB mode without logging
      and transactions.

The latter has now been changed to a runtime flag. If you define the flag
SMI:NoTransactions in the SecondoConfig.ini file (the RTFlags key), Secondo
will start up the SMI without transactions and logging. This mode can be
useful for restoring databases or for seeing how much overhead logging and
transactions cause during query processing.

You can also use the version with the NL_PERSISTENT flag permanently, since
for most commands the buffer of the Nested-List module is big enough and hence
no additional disk I/O is needed. I tested it for a while and I think it runs
stably now. However, each time you change the NL_PERSISTENT key, run make
clean before building Secondo again. This is because the Nested-List module is
used nearly everywhere in the system and I'm not sure whether the makefiles
handle all dependencies correctly.

Bye
Markus


2003-11-07 About makefiles
===============================================================================

Hello all,

I have changed some makefiles and created some new ones, since some of them
were too complex. Now the structure is as follows (-> indicates an include
relation):

  makefile -> makefile.env

The top level makefile should only call other makefiles in an appropriate
order. The rules for creating libraries were moved to a file makefile.libs.
Every sub level makefile includes makefile.env, which contains all basic
definitions such as names of tools, directories for searching includes, etc.

  makefile.env -> makefile.jni
               -> makefile.algebras
               -> makefile.optimizer
               -> makefile.{linux,win32}

makefile.algebras is used for the definition of algebra names and the
directories where their source code resides. This makes it more convenient to
switch algebras on or off, since you only have to comment out two lines and
edit the file /Algebras/Management/AlgebraList.i. You don't have to change the
file Algebras/makefile anymore. I also corrected the dependencies so that you
don't need to do a "make clean" after an algebra re-configuration. But be
careful, there may be some interdependencies between algebras. Every algebra
implementor who knows about dependencies should document them in this file.

makefile.jni contains information about JNI-Algebras written in Java.
Currently, this doesn't work properly but will be corrected next week. The
compilation of JNI-Algebras is controlled by the macro USE_JNI in the
makefile.env

makefile.optimizer contains information about the Prolog and JPL installation.
If this file detects that Prolog is installed (this is determined by some
environment variables which have to be set manually after the Prolog
installation), the optimizer will be compiled. Additionally, the JPL based
Optimizer Server will be compiled.

Finally, here are some installation instructions:

MS-Windows:

The Berkeley-DB will now be installed at a global place in the file system,
therefore you will have to change some environment variables. If you have
already installed Secondo from the CD-ROM and want to make a new initial copy
from the cvs server, do the following:

  cvs update secondo
  cd secondo/Win32
  make install
  cd MSYS
  make install
  notepad /etc/setvar.bash

Set up your Prolog and J2SDK installation directories. After you have done
this, you will never have to change to the Win32 directory and run
"make install" again.

Linux:

Replace your setvar.bash script with the version in the secondo directory. It
may have been renamed; try

  which setvar

to determine its location. Otherwise look into your .profile or .bashrc shell
configuration file. Adjust the Prolog and J2SDK directories.

If you have problems, please contact me.

Bye
Markus


2004-04-30
===============================================================================

Hello all,

I have committed the following changes to CVS:

1) New operators seqinit: int -> bool and seqnext: -> int are implemented in
the StandardAlgebra. These operators allow creating a sequence of numbers.
After startup the sequence counter is set to 0. With seqinit it can be reset
to an arbitrary integer value. seqnext() can be used in conjunction with the
extend operators to add unique or uniformly distributed attribute values
(seqnext() mod N) to relations.

2) The random number generation in the randint and sample operators has been
revised as recommended in the man page documentation of the rand() function.
In C++ code, always use rand()/(RAND_MAX+1.0) to create floating point values
in the range [0,1) and multiply them by a constant N to create values in the
range [0,N-1]. On Windows the library dependent constant RAND_MAX is limited
to roughly 32,000 (32767), which is a very small value.
With the modification above, the sample operator will create uniformly
distributed numbers, but the total number of samples is limited to
3*RAND_MAX/4 to avoid long runtimes.

3) In case of abnormal program termination a stack trace will be printed on
the screen (Linux version only). This may help to determine the origin of
trouble at first glance.

Bye
Markus