
FD_kdb+tick_manual_1.0

First Derivatives plc
Kdb+/tick Manual

About the Authors

Arthur Whitney
Arthur Whitney is CTO and Co-Founder of Kx Systems. Prior to founding Kx, he was a Managing Director of Union Bank of Switzerland (UBS) in New York, where he led an internal team that developed global trading and risk management systems using the K language. Prior to UBS, Arthur was at Morgan Stanley & Co., where he developed the A+ programming language, used to build trading systems, databases and analytics for equities and fixed income. He studied set theory, foundations and computational complexity at the University of Toronto and Stanford.

Brian Conlon
Brian Conlon is CEO of First Derivatives plc. He trained with KPMG before joining the Risk Management team in Morgan Stanley International in London. He then joined SunGard as a capital markets consultant. During his time with SunGard he worked with more than 60 financial institutions worldwide. He left in 1996 to set up First Derivatives.

Michael O'Neill
Michael O'Neill is COO of First Derivatives plc. Prior to joining First Derivatives, Michael spent 8 years in the actuarial industry with Lloyds Abbey Life. As manager of Single Premium Products he had a key role in the design, development and marketing of derivative-based investment products. He left to join FD in 1997. Michael now oversees the Kx sales effort in NYC and Europe.

Peter Durkan
Peter Durkan is a financial engineer with extensive experience in risk management and in proprietary and third-party financial software systems. He is now a key member of First Derivatives' Kx team and has been involved in projects at some of the world's largest investment banks, helping them optimize performance in key areas such as trading strategies, risk analytics and portfolio management.

About First Derivatives and Kx Systems

About First Derivatives
First Derivatives plc (www.firstderivatives.com) is a recognised and respected service provider with a global client base. FD specialises in providing services to both financial software vendors and financial institutions. Based in the UK, the company has drawn its consultants from a range of technical backgrounds; they have industry experience in equities, derivatives, fixed income, fund management, insurance and financial/mathematical modeling, combined with extensive experience in the development, implementation and support of large-scale trading and risk management systems.

About Kx Systems
Kx Systems (www.kx.com) provides ultra high performance database technology, enabling innovative companies in finance, insurance and other industries to meet the challenges of acquiring, managing and analyzing massive amounts of data in real-time. Their breakthrough in database technology addresses the widening gap between what ordinary databases deliver and what today's businesses really need. Kx Systems offers next-generation products built for speed, scalability, and efficient data management.

Strategic Partnership
First Derivatives have been working with Kx technology since 1998 and are one of two accredited partners of Kx Systems worldwide. FD plc deals with all queries in relation to Kx products for the financial sector worldwide and the EMEA market in general.
First Derivatives offers a complete range of Kx technology services:
* Proof of Concepts (FREE)
* Training (K, KSQL, Kdb DBA)
* Systems Architecture & Design
* K, KSQL development resources
* Kdb+/tick implementation and customization
* Database Migration
* Production Support

First Derivatives Services
First Derivatives' team of Business Analysts, Quantitative Analysts, Financial Engineers, Software Engineers, Risk Professionals and Project Managers provides a range of general services including:
* Financial Engineering
* Risk Management
* Project Management
* Systems Audit and Design
* Software Development
* Systems Implementation
* Systems Integration
* Systems Support
* Beta Testing

Contact:
North American Office (NY): +1 212-792-4230
European Office (UK): +44 28 3025 4870
Michael O'Neill, Chief of Operations: moneill@firstderivatives.com
Victoria Shanks, Business Development Manager: vshanks@firstderivatives.com

Contents

About the Authors
About First Derivatives and Kx Systems
Kdb+/tick Architecture
Basic Overview
Feed Handler
Ticker-plant
Real-Time Subscribers
Chained Ticker-plants
Historical Database
Implementing kdb+/tick
Installation
A Brief Description of the Scripts
The Ticker-plant System
Starting the Ticker-plant
Ticker-plant Configuration
Feed Handler Configuration
Using multiple ticker-plants
Performance
Kdb+ memory usage
Real-Time Subscribers
Kdb+ Real-Time Databases
Performance
Failure Management
Backup and Recovery
Failover and Replication
Ticker-plant Failure
Real-time Database Recovery
Replicated Databases
Data feed failover
Multiple Ticker-plants
Hardware Failure
Appendices
Appendix A: Troubleshooting Kdb+/tick and Kdb+/taq
Appendix B: Technical Implementation of Ticker-plant
Appendix C: Custom Ticker-plants
Appendix D: The Reuters Feed Handler

Kdb+/tick Architecture

The diagram below gives a generalized outline of a typical Kdb+/tick architecture, followed by a brief explanation of the various components and the through-flow of data.

Basic Overview

* The ticker-plant, real-time database and historical database are operational on a 24/7 basis.
* The data from the data feed is parsed by the feed handler.
* The feed handler publishes the parsed data to the ticker-plant.
* Immediately upon receiving the parsed data, the ticker-plant writes the new data to the log file and updates its own internal tables.
* On a timer loop, the ticker-plant publishes all the data held in its tables to the real-time database and publishes to each subscriber the data they have requested. The ticker-plant then purges its tables. So the ticker-plant captures intra-day data but does not store it.
* The real-time database holds the intra-day data and accepts queries.
* In general, clients which need immediate updates of data (for example custom analytics) will subscribe directly to the ticker-plant (becoming a real-time subscriber). Clients which don't require immediate updates, but need a view of the intra-day data, will query the real-time database.
* A real-time subscriber can also be a chained ticker-plant. In this case it receives updates from a ticker-plant (which could itself be a chained ticker-plant) and publishes to its subscribers. This will reduce latency through the system.
* At the end of the day the log file is deleted and a new one created, and the real-time database saves all its data to the historical database and purges its tables.

Feed Handler

The feed handler is a specialized Kdb+ process which connects to the data feed, retrieves the data, and converts it from the feed-specific format into a Kdb+ message, which is published to the ticker-plant process.

Ticker-plant

The core component of Kdb+/tick is the ticker-plant database, a specialized Kdb+ application that operates in a publish-and-subscribe configuration.
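The flow outlined in the Basic Overview (log on receipt, publish on a timer, then purge) can be pictured with a small model. The following is a minimal Python sketch, not kdb+/tick code: the class name `TickerPlant`, its methods, and the sample rows are all hypothetical, and the "log" is just an in-memory list standing in for the log file.

```python
from collections import defaultdict


class TickerPlant:
    """Toy model of the flow: log on receipt, publish on a timer, then purge."""

    def __init__(self):
        self.log = []                     # stands in for the on-disk log file
        self.tables = defaultdict(list)   # intra-day buffer, purged after each publish
        self.subscribers = []             # callables taking (table, rows)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def upd(self, table, row):
        # Called by the feed handler: log first, then buffer the row.
        self.log.append((table, row))
        self.tables[table].append(row)

    def on_timer(self):
        # Publish everything buffered since the last tick, then purge.
        for table, rows in self.tables.items():
            if rows:
                for callback in self.subscribers:
                    callback(table, list(rows))
        self.tables.clear()


received = []
tp = TickerPlant()
tp.subscribe(lambda table, rows: received.append((table, rows)))
tp.upd("trade", {"sym": "IBM", "price": 101.5, "size": 200})
tp.upd("trade", {"sym": "MSFT", "price": 27.1, "size": 500})
tp.on_timer()   # subscribers receive both rows in one batch
tp.on_timer()   # nothing buffered, so nothing is published
```

In this model the log keeps every row while the buffer is emptied on each timer tick, which mirrors the statement that the ticker-plant captures intra-day data but does not store it.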
The ticker-plant acts as a gateway between a data feed and a number of subscribers, by performing the following operations:

1. It receives the data from the feed handler. The ticker-plant stores the data in memory for a short, configurable period of time.
2. It logs updates to disk for recovery from failure and updates any subscribers.
3. The clients subscribe to the ticker-plant rather than the real-time database. Once subscription has been made, the client will receive all subsequent updates. The real-time database is merely a replication of all the data in the log file.
4. At day end, the ticker-plant sends an end-of-day message to the real-time database, which causes the real-time database to save all the intra-day data to the historical database and reset its tables. This message is also sent to all subscribers, which can act on it accordingly. The log is also deleted, and a new one created.
5. The effect of this is that the ticker-plant, the real-time database and the historical database are operational on a 24/7 basis.
6. The latency between the feed and the data being written to the log is less than 1 millisecond.

Although Kdb+/tick comes in a number of pre-defined configurations, practically all of the described operations can be fully customized to handle different types of data.

Since the ticker-plant is a Kdb+ application, its tables can be queried using q like any other Kdb+ database. However, to ensure fail-safe and real-time operation, it is advisable that the ticker-plant is only queried directly for testing and diagnostic purposes. All ticker-plant clients should only have access to the database as subscribers, and these Kdb+ subscribers (see next section) should be used as database servers.

Real-Time Subscribers

Real-time subscribers are processes that subscribe to the ticker-plant and receive updates on the requested data.
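To make the subscriber side concrete, here is a minimal Python sketch of a real-time database that keeps only the symbols it asked for and reacts to an end-of-day message by saving and resetting its tables. It is an illustration only and does not use the kdb+/tick API: the class `RealTimeDatabase`, its methods, the symbol filter, and the sample date are hypothetical, and a dictionary stands in for the historical database on disk.

```python
class RealTimeDatabase:
    """Toy subscriber: keeps intra-day rows for the symbols it asked for."""

    def __init__(self, symbols=None):
        self.symbols = set(symbols) if symbols else None   # None means "all symbols"
        self.tables = {}       # table name -> list of rows held in memory
        self.historical = {}   # date string -> saved tables (stands in for disk)

    def upd(self, table, rows):
        # Called for each batch published by the ticker-plant.
        wanted = [r for r in rows if self.symbols is None or r["sym"] in self.symbols]
        self.tables.setdefault(table, []).extend(wanted)

    def end_of_day(self, date):
        # Save the day's data under its date, then reset the in-memory tables.
        self.historical[date] = self.tables
        self.tables = {}

    def query(self, table, sym):
        return [r for r in self.tables.get(table, []) if r["sym"] == sym]


rdb = RealTimeDatabase(symbols=["IBM"])
rdb.upd("trade", [{"sym": "IBM", "price": 101.5}, {"sym": "MSFT", "price": 27.1}])
rdb.upd("trade", [{"sym": "IBM", "price": 101.7}])
ibm_before = rdb.query("trade", "IBM")   # two IBM rows held intra-day
rdb.end_of_day("2024.01.02")
ibm_after = rdb.query("trade", "IBM")    # empty: tables were reset at day end
```

Queries are answered from the subscriber's own tables, so in this model the ticker-plant itself is never queried by clients.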
In general these should be started at the same time as the ticker-plant to capture all of the data for the day, though they can be started later to subscribe for all future updates and possibly to retrieve all of the data collected by the ticker-plant up to that point from the real-time database.

Typical real-time subscribers are Kdb+ databases that process the data received from the ticker-plant and/or store them in local tables. The subscription, data processing, and schema of a real-time database can be easily customized.

Kdb+/tick includes a set of default real-time databases, which are in-memory Kdb+ databases that can be queried in real-time, taking full advantage of the powerful analytical capabilities of the q language and the incredible speed of Kdb+. Each real-time database subscribing to the ticker-plant can support hundreds of clients and still deliver query results in milliseconds. Clients can connect to a real-time database using one of the many interfaces available on Kdb+, including C/C++, C#, Java and the embedded HTTP server, which can format query results in HTML, XML, TXT, and CSV.

Multiple real-time databases subscribing to the ticker-plant may be used, for example, to off-load queries that employ complex, special-purpose analytics. The update data they receive may simply be used to update special-purpose summary tables.

Real-time subscribers are not necessarily Kdb+ databases. Using one of the interfaces above or just plain TCP/IP socket programming, custom subscribers can be created using virtually any programming language, running on virtually any platform.

Chained Ticker-plants

Real-time subscribers can also be chained ticker-plants. This means that they have subscribers themselves which they publish updates to. The use of chained ticker-plants reduces the latency of data through the paths of the system.
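A chained ticker-plant can be pictured as a subscriber that is also a publisher. The Python sketch below is a hedged illustration, not kdb+/tick code: the class `ChainedTickerPlant` and its methods are hypothetical, and the choice to forward only a fixed set of symbols is an assumption made for the example.

```python
class ChainedTickerPlant:
    """Toy relay: subscribes upstream and republishes to its own subscribers."""

    def __init__(self, symbols):
        self.symbols = set(symbols)   # subset of the parent's data to pass on
        self.subscribers = []         # callables taking (table, rows)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def upd(self, table, rows):
        # Receive a batch from the parent, keep the subset, and forward it.
        subset = [r for r in rows if r["sym"] in self.symbols]
        if subset:
            for callback in self.subscribers:
                callback(table, subset)


downstream = []
chained = ChainedTickerPlant(symbols=["IBM"])
chained.subscribe(lambda table, rows: downstream.append((table, rows)))

# The parent ticker-plant would call chained.upd as it does for any subscriber.
chained.upd("trade", [{"sym": "IBM", "price": 101.5}, {"sym": "MSFT", "price": 27.1}])
chained.upd("trade", [{"sym": "MSFT", "price": 27.2}])   # nothing to forward
```

Because the relay exposes the same `upd` entry point as any other subscriber, the parent does not need to know whether it is publishing to a database or to another ticker-plant.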
It is likely that each chained ticker-plant would operate on a subset of data from its parent ticker-plant, and do some calculations on this data. In this way, each subscriber in the chain will be acting on an up-to-date set of processed data.

Historical Database

The real-time database can be configured to execute an end-of-day process that transfers all the collected data into a historical database. The historical database is a partitioned database composed of a collection of independent segments, any subset of which comprises a valid historical database. The database segments can all be stored within one directory on a disk, or distributed over multiple disks to maximize throughput.

The historical database is partitioned by date, and each database segment is a directory on disk whose name is the date shared by all data in that segment. A query of the historical database is processed one segment at a time, possibly in parallel by multiple processes working on different disks. The historical database layout can easily be customized, as can its stored procedures and specialized analytics.

Kdb+/tick is provided in different default configurations according to the type of data collected by the ticker-plant.

Implementing kdb+/tick

The table below outlines the main steps in a standard kdb+/tick implementation with cross references to other parts of this manual. It is not exhaustive but should give an indication of the main areas to consider.

Task | Details and Manual References
Install kdb+ and kdb+/tick | Installation
Configure the ticker-plant | Define the database schema and define and activate the connection to the (various) datafeed(s). Kdb+/tick comes with a number of predefined configuration scripts including two basic equity ticker-plants (TAQ and SYM), the Level 2 ticker-plant and a futures ticker-plant. (Ticker-plant Configuration) The default handler of kdb+/tick is Reuters SSL but custom feeds and schema can be built. (Custom Ticker-plants)
Managing the ticker-plant in production | Personnel tasked with managing the ticker-plants should get some understanding of how the database is partitioned and some of the conventions used (The Ticker-plant). Further consideration will need to be given to issues of scheduling startups and performance optimization (Performance).
Real-time database subscribers | Kdb+/tick can be configured to update a number of real-time subscribers. (Real-time subscribers)
Historical Database Issues | Historical Database
Making use of the ticker-plant | The installation of kdb+/tick is normally designed to take advantage of the power of q. There may be some requirement to use analytics or interfaces in other languages such as C++ or .net.
Multiple ticker-plants | Using multiple ticker-plants

Installation

To install kdb+/tick you need to have a valid licensing agreement with Kx Systems. The installation and license files for Kdb+/tick must be obtained directly from Kx Systems. The license file 'k4.lic' must be copied into the Kdb+ installation directory.

The Kdb+/tick distribution file is called 'tick.zip', and contains the ticker-plant core and the configuration scripts for a variety of ticker-plant, real-time, and historical databases. To install, simply extract the contents of the zip archive under the k4/ directory.

Prior to installing Kdb+/tick and Kdb+/taq, Kdb+ must also be installed on the system. On Windows the default installation directory is "C:\k4" and under Solaris or Linux it is "$HOME/k4". This location can be controlled via the "KHOME" environment variable. For example, users with the Windows operating system should unzip this file and install the contents in the C:\k4 folder.

Directory/file | Purpose
C:\k4\tick | This contains all the q feedhandler and client code. The code within this folder may need to be modified for a number of purposes, e.g.: taq.txt/sym.txt could be modified to capture different Reuters fields; the schema scripts sym.q/taq.q, which define the table structure, may also need to be changed; ssl.q may need to be modified to provide for different feedhandlers; it may also be necessary to add to some of the default subscribers.
C:\k4\tick.k | This is the module containing all the ticker-plant functionality. On certain occasions this will need to be modified to meet your customized requirements.

For simplicity, the $HOME/k4/ and C:\k4 directories will be indicated as the k4/ directory in the remainder of this document. Also, we will use "/" as the path separator. Please note that on Windows this is "\" (back-slash). The path from which the commands are executed, instead, will be indicated as the working directory.

A Brief Description of the Scripts

* tick.k
  The main ticker-plant script, which contains functionality for publishing and subscribing. This receives the data from the feedhandler, immediately updates the log, and publishes to the real-time database and all subscribers on a predefined heartbeat.
* ssl.q (the feedhandler)
  This script receives the raw data from the feed, parses it and sends it to the ticker-plant (tick.k). It connects to the feed by dynamically loading a C library containing functions for subscribing to Reuters. This can be configured for many different types of feed by making a few changes to the parsing rules.
* r.k
  This is the real-time database (RDB), which maintains a complete view of the intra-day data. On subscription the RDB loads the day's tick data up to that point from the log on disk, and then continues to receive updates via TCP/IP. In this way an RDB can subscribe at any time during the day without overloading or delaying the plant.
* Schema scripts
  These define the schema for the ticker-plant and contain only the table definitions.
  taq: trade and quote data
  sym: simplified trade and quote data
  fx: Forex data
  lvl2: level2 data
* feedsym.k
  This is a simulated sym feed for testing the ticker-plant and corresponds to the sym.q schema.
* sub.k/u.k
  These scripts allow for 'chained subscriber implementations', which means that any subscriber to the ticker-plant can itself be a publisher/subscriber server, just like the original ticker-plant.
* c.q
  This script contains numerous easily configured sample ticker-plant subscribers.

The Ticker-plant System

Starting the Ticker-plant

A ticker-plant system usually has the ticker-plant, real-time db, historical db, one or more feeds and several clients. The file test.q included in the tick directory contains a script to start a ticker-plant system. (Note: This can be used only on Windows. For Solaris and Linux it should be changed to reflect a proper terminal starting command.)

    \start q tick.k sym . -p 5010
    \start q tick/r.k 5010 -p 5011
    \start q ./sym -p 5012
    \start q tick/c.q vwap 5010
    \start q tick/ssl.q sym 5010

Explanation

    q tick.k sym . -p 5010

This line starts the ticker-plant, using the table schema tick/sym.q. The general form of it is

    q tick.k SRC DST [-p 5010] [-t 1000] [-o hours]

SRC specifies the schema to be loaded. sym refers to tick/sym.q and is also the default value.

DST specifies the location of the log file, either locally or remotely, and the real-time replication server. The log will have the path `:DST/symYYYY.MM.DD, i.e. the schema type with the current date appended, at the location specified by DST. If DST is not specified, a log is not created and the RDB is not used. The "." in the line specified in test.q refers to the current directory.

The -p option enables a q-IPC server listening for incoming subscriptions on the specified TCP/IP port. If no q-IPC port is specified, the default port of 5010 is used.

The -t option sets the update interval used by the ticker-plant. This value defines, in milliseconds, how often the ticker-plant publishes data to its real-time subscribers. The update interval defaults to 1000ms (1 sec). A smaller interval can be used and will lower latency; however, this can significantly increase CPU usage, so it is necessary to monitor the effect of any change to this setting to ensure that there is no risk of the processor falling behind during periods of high market activity.

The -o option is the offset in hours from GMT. This defaults to 0.

    q tick/r.k 5010 -p 5011

This line starts the RDB on the port specified by -p. The 5010 specifies the port on which the ticker-plant is running and tells the RDB which port to connect to in order to receive real-time updates.

    q ./sym -p 5012

This starts the historical db on the port specified by -p. The general form of it is

    q DST/SRC -p 5012

Both DST and SRC should be the same as those specified in the ticker-plant start-up line. This is because the ticker-plant will save the historic data to this location, so the historic db should run from the data found in this location.

Configuration

Kdb+/tick comes with a number of pre-defined ticker-plant schemas, including two basic equity ticker-plants (TAQ and SYM), the Level 2 ticker-plant and an FX ticker-plant. The default feed handler of Kdb+/tick (Reuters SSL) is used by all four configurations. Whichever s