Apache httpd v2.4: Hello Cloud Jim Jagielski
What we will cover • Performance Related Enhancements • Reverse Proxy Server Enhancements
Apache httpd 2.4 Currently in beta release Expected GA: This May! Significant Improvements high-performance cloud suitability
Apache httpd 2.4 Support for async I/O w/o dropping support for older systems Larger selection of usable MPMs: added Event, Simple, etc... Leverages higher-performance versions of APR
Apache httpd 2.4 Bandwidth control now standard Finer control of timeouts, esp. during requests Controllable buffering of I/O Support for Lua
Apache httpd 2.4 Reverse Proxy Improvements Supports FastCGI, SCGI Additional load balancing mechanisms Runtime changing of clusters w/o restarts Support for dynamic configuration
mod_proxy • An Apache module • Implements core proxy capability • Both forward and reverse proxy • In general, most people use it for reverse proxy (gateway) functionality
How did we get here? • A stroll down mod_proxy lane – First available in Apache 1.1 • “Experimental Caching Proxy Server” – In Apache 1.2, pretty stable, but just HTTP/1.0 – In Apache 1.3, much improved with added support for HTTP/1.1 – In Apache 2.0, break out cache and proxy – In Apache 2.2, lay framework
Proxy Improvements • Becoming a robust but generic proxy implementation • Support various protocols – HTTP, HTTPS, CONNECT, FTP – AJP, FastCGI, SCGI, WSGI (soon) – Load balancing • Clustering, failover
AJP? Really? • Yep, Apache can now talk AJP with Tomcat directly • mod_proxy_ajp is the magic mojo • Other proxy improvements make this even more exciting • mod_jk alternative
But I like mod_jk • That’s fine, but... – Now the config is much easier and more consistent
ProxyPass /servlets ajp://tc.example.com:8089
– Easier when Apache needs to proxy both HTTP and AJP – Leverage improvements in proxy module
Features of Proxy Server • Performance • Monitoring • Filtering • Caching (with mod_cache)
Reverse Proxy • Operated at the server end of the transaction • Completely transparent to the Web Browser – thinks the Reverse Proxy Server is the real server
[Figure: Browser, Internet cloud, Firewall, Reverse Proxy Server, Firewall, Transactional Servers]
Features of Reverse Proxy • Security – Uniform security policy can be administered – The real transactional servers are behind the firewall • Delegation, Specialization, Load Balancing
Configuring Reverse Proxy • Set ProxyRequests Off • Apply ProxyPass, ProxyPassReverse and possibly RewriteRule directives
Reverse Proxy Directives: ProxyPass • Allows a remote server to be mapped into the URL space of the local (Reverse Proxy) server • Example: – ProxyPass /secure/ http://secureserver/
– Presumably “secureserver” is inaccessible directly from the internet
Reverse Proxy Directives: ProxyPassReverse • Used to specify that redirects issued by the remote server are to be translated to use the proxy before being returned to the client • Syntax is identical to ProxyPass; used in conjunction with it • Example: – ProxyPass /secure/ http://secureserver/ – ProxyPassReverse /secure/ http://secureserver/
Simple Rev Proxy • All requests for /images go to a backend server
ProxyPass /images http://images.example.com/
• General form:
ProxyPass <path> <scheme>://<full url>
• Useful, but limited • What if: – images.example.com dies? – traffic for /images increases
Baby got back • We need more backend servers • And balance the load between them • Before 2.2, mod_rewrite was your only option • Some people would prefer spending an evening with a life insurance salesman rather than deal with mod_rewrite
Load Balancer • mod_proxy_balancer.so • mod_proxy can do native load balancing – weight by actual requests – weight by traffic – weight by busyness – lbfactors
Load Balancer • LB algorithms are implemented as providers – easy to add – no core code changes required – growing list of methods
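The "weight by actual requests" idea can be illustrated with a small simulation. This is a simplified sketch modeled on the request-counting scheme described in the mod_lbmethod_byrequests documentation, not the module's actual code; the Worker class, the function names and the worker names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    lbfactor: int      # configured weight (the loadfactor parameter)
    lbstatus: int = 0  # running "urgency" score, updated on every pick


def pick_byrequests(workers):
    """Pick the next worker using a weighted request-counting scheme."""
    total = 0
    candidate = None
    for w in workers:
        # Every worker becomes more urgent in proportion to its weight.
        w.lbstatus += w.lbfactor
        total += w.lbfactor
        if candidate is None or w.lbstatus > candidate.lbstatus:
            candidate = w
    # The chosen worker pays back the total weight handed out this round.
    candidate.lbstatus -= total
    return candidate


def simulate(workers, n):
    """Count how many of n requests each worker receives."""
    return Counter(pick_byrequests(workers).name for _ in range(n))


counts = simulate([Worker("www1", 1), Worker("www2", 4)], 10)
print(dict(counts))
```

With load factors 1 and 4, the second worker ends up with four times as many requests as the first over any multiple of five requests.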
Load Balancer • Backend connection pooling • Available for named workers: – eg: ProxyPass /foo http://bar.example.com • Reusable connection to origin – For threaded MPMs, can adjust size of pool (min, max, smax) – For prefork: singleton • Shared data held in shared memory
Pooling example
<Proxy balancer://foo>
  BalancerMember http://www1.example.com:80/ loadfactor=1
  BalancerMember http://www2.example.com:80/ loadfactor=1
  BalancerMember http://www3.example.com:80/ loadfactor=4 status=+h
  ProxySet lbmethod=bytraffic
</Proxy>
Load Balancer • Sticky session support – aka “session affinity” • Cookie based – stickysession=PHPSESSID – stickysession=JSESSIONID • Natively easy with Tomcat • May require more setup for “simple” HTTP proxying
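A rough sketch of cookie-based session affinity, assuming the Tomcat-style convention in which the route name is appended to the session ID after a dot (Tomcat's jvmRoute). The function names and the worker table are hypothetical; this illustrates the idea and is not mod_proxy_balancer's implementation.

```python
def route_from_session(session_value):
    """Return the route suffix of a session value, or None if there is none."""
    if not session_value or "." not in session_value:
        return None
    # Everything after the first dot is treated as the route name.
    route = session_value.split(".", 1)[1]
    return route or None


def pick_sticky(session_value, workers_by_route):
    """Return the worker bound to the session's route.

    Returns None when the session carries no known route, in which case
    the caller would fall back to the normal load balancing method.
    """
    return workers_by_route.get(route_from_session(session_value))


workers = {"tc1": "ajp://tc1:8089", "tc2": "ajp://tc2:8089"}
print(pick_sticky("A1B2C3.tc2", workers))
```

A request whose JSESSIONID ends in ".tc2" keeps going to the tc2 worker; a request without a route suffix is balanced normally.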
Load Balancer • Cluster set with failover • Group backend servers as numbered sets – balancer will try lower-valued sets first – If no workers are available, will try next set • Hot standby
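The set-based failover described above can be sketched as a candidate filter. This is a simplified model: the Member class and its fields are hypothetical, and the choice to try hot standbys in the current set before moving on to the next set is an assumption, not something the slide states.

```python
from dataclasses import dataclass


@dataclass
class Member:
    url: str
    lbset: int = 0
    hot_standby: bool = False
    usable: bool = True  # False models a worker in an error or disabled state


def candidates(members):
    """Return the members eligible to serve the next request."""
    usable = [m for m in members if m.usable]
    if not usable:
        return []
    # Lower-numbered sets are tried first.
    lowest = min(m.lbset for m in usable)
    in_set = [m for m in usable if m.lbset == lowest]
    # Hot standbys are used only when no normal member of the set is usable.
    normal = [m for m in in_set if not m.hot_standby]
    return normal or in_set


cluster = [
    Member("http://php1:8080/"),
    Member("http://php2:8080/"),
    Member("http://phpbkup:8080/", hot_standby=True),
    Member("http://offsite1:8080/", lbset=1),
    Member("http://offsite2:8080/", lbset=1),
]
print([m.url for m in candidates(cluster)])
```

The configured load balancing method then chooses among the returned candidates.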
Example
<Proxy balancer://foo>
  BalancerMember http://php1:8080/ loadfactor=1
  BalancerMember http://php2:8080/ loadfactor=4
  BalancerMember http://phpbkup:8080/ loadfactor=4 status=+h
  BalancerMember http://offsite1:8080/ lbset=1
  BalancerMember http://offsite2:8080/ lbset=1
  ProxySet lbmethod=bytraffic
</Proxy>
ProxyPass /apps/ balancer://foo/
Embedded Admin • Allows for real-time – Monitoring of stats for each worker – Adjustment of worker params • lbset • load factor • route • enabled / disabled • ...
Embedded Admin • Allows for real-time • Addition of new workers/nodes • Change of LB methods • Can be persistent • More RESTful • Can be CLI-driven
Easy setup
<Location /balancer-manager>
  SetHandler balancer-manager
  Order Deny,Allow
  Deny from all
  Allow from 192.168.2.22
</Location>
Admin [Figures: balancer-manager pages, including “Changing the LBmethod” and “Adding new worker”]
Some tuning params • For workers: – loadfactor • normalized load for worker [1]
– lbset • worker cluster number [0]
– retry • retry timeout, in seconds, for failed workers [60]
Some tuning params • For workers - connection pool: – min • Initial number of connections [0]
– max • Hard maximum number of connections [1 for prefork | ThreadsPerChild for threaded MPMs]
– smax: • soft max - keep this number available [max] • time to live for connections above smax
Some tuning params • For workers - connection pool: – disablereuse: • bypass the connection pool
– ttl • time to live for connections above smax
Some tuning params For workers (cont): – connectiontimeout/timeout • Connection timeouts on backend [ProxyTimeout] – flushpackets * • Does proxy need to flush data with each chunk of data? – on : Yes | off : No | auto : wait and see
– flushwait * • ms to wait for data before flushing
Some tuning params For workers (cont): – status (+/-) • D : disabled • S : Stopped • I : Ignore errors • H : Hot standby • E : Error
Some tuning params For balancers: – lbmethod • load balancing algo to use [byrequests] – stickysession • sticky session name (eg: PHPSESSID) – maxattempts • failover tries before we bail – nofailover • Back-ends don't support failover so don't send session when failing over
Recent improvements • ProxyPassMatch – ProxyPass can now take regexes instead of just “paths” – JkMount migration
ProxyPassMatch ^(/.*\.gif)$ http://backend.example.com$1
• Or:
ProxyPass ~ ^(/.*\.gif)$ http://backend.example.com$1
• mod_rewrite is balancer aware
Recent improvements • ProxyPassReverse is NOW balancer aware! • The below will work:
<Proxy balancer://foo>
  BalancerMember http://php1:8080/ loadfactor=1
  BalancerMember http://php2:8080/ loadfactor=4
</Proxy>
ProxyPass /apps/ balancer://foo/
ProxyPassReverse /apps/ balancer://foo/
Useful Envars • BALANCER_SESSION_STICKY – This is assigned the stickysession value used in the current request. It is the cookie or parameter name used for sticky sessions • BALANCER_SESSION_ROUTE – This is assigned the route parsed from the current request. • BALANCER_NAME – This is assigned the name of the balancer used for the current request. The value is something like balancer://foo.
Useful Envars • BALANCER_WORKER_NAME – This is assigned the name of the worker used for the current request. The value is something like http://hostA:1234. • BALANCER_WORKER_ROUTE – This is assigned the route of the worker that will be used for the current request. • BALANCER_ROUTE_CHANGED – This is set to 1 if the session route does not match the worker route (BALANCER_SESSION_ROUTE != BALANCER_WORKER_ROUTE) or the session does not yet have an established route. This can be used to determine when/if the client needs to be sent an updated route when sticky sessions are used.
Putting it all together
<Proxy balancer://foo>
  BalancerMember http://php1:8080/ loadfactor=1
  BalancerMember http://php2:8080/ loadfactor=4
  BalancerMember http://phpbkup:8080/ loadfactor=4 status=+h
  BalancerMember http://phpexp:8080/ lbset=1
  ProxySet lbmethod=bytraffic
</Proxy>
<Proxy balancer://javaapps>
  BalancerMember ajp://tc1:8089/ loadfactor=1
  BalancerMember ajp://tc2:8089/ loadfactor=4
  ProxySet lbmethod=byrequests
</Proxy>
Putting it all together
ProxyPass /apps/ balancer://foo/
ProxyPass /serv/ balancer://javaapps/
ProxyPass /images/ http://images:8080/
Manipulating HTTP Headers: • Modify HTTP request and response headers – Can be used in Main server, Vhost, Directory, Location, Files sections – Headers can be merged, replaced or removed – Pass on client-specific data to the backend server • IP Address, Request scheme (HTTP, HTTPS), UserAgent, SSL connection info, etc.
Manipulating HTTP Headers: • Shield backend server’s info from the clients – Strip out Server name – Server IP address – etc.
Header examples • Copy all request headers that begin with “TS” to response headers – Header echo ^TS • Say hello to Joe – Header add JoeHeader "Hello Joe!" • If header “MyRequestHeader: value” is present, response will contain “MyHeader” header: – SetEnvIf MyRequestHeader value HAVE_MyRequestHeader – Header add MyHeader "%D %t mytext" env=HAVE_MyRequestHeader
Header examples • Remember, sequence is important! The following results in “MyHeader” being stripped from the request: – RequestHeader append MyHeader "value1" – RequestHeader append MyHeader "value2" – RequestHeader unset MyHeader
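Why ordering matters can be seen with a toy model that applies header operations one after another. This is only an illustration: the function name and the tuple format are hypothetical, and values are kept in a list rather than merged into one comma-separated header the way mod_headers does.

```python
def apply_request_header_ops(headers, ops):
    """Apply (action, name[, value]) operations in order; return new headers."""
    result = {name: list(values) for name, values in headers.items()}
    for action, name, *value in ops:
        if action == "append":
            # Add a value, keeping any values already present.
            result.setdefault(name, []).append(value[0])
        elif action == "set":
            # Replace whatever was there before.
            result[name] = [value[0]]
        elif action == "unset":
            # Remove the header entirely, if present.
            result.pop(name, None)
        else:
            raise ValueError(f"unknown action: {action}")
    return result


ops = [
    ("append", "MyHeader", "value1"),
    ("append", "MyHeader", "value2"),
    ("unset", "MyHeader"),
]
final = apply_request_header_ops({}, ops)
reordered = apply_request_header_ops({}, [ops[2], ops[0], ops[1]])
print(final, reordered)
```

Running the unset last removes the header; running it first leaves both appended values in place.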
Example: • Pass additional info about Client Browsers to the App Server:
ProxyPass / http://backend.covalent.net
ProxyPassReverse / http://backend.covalent.net
RequestHeader set X-Forwarded-IP %{REMOTE_ADDR}e
RequestHeader set X-Request-Scheme %{REQUEST_SCHEME}e
• App Server receives the following HTTP headers: – X-Forwarded-IP: 10.0.0.3 – X-Request-Scheme: https
Using mod-rewrite example
# mod_proxy lb example using request parameter
RewriteEngine On
# Use mod_rewrite to insert a node name into the url
RewriteCond %{QUERY_STRING} accountId=.*([0-2])\b
RewriteRule ^/sampleApp/(.*) balancer://tc1/$1 [P]
RewriteCond %{QUERY_STRING} accountId=.*([3-6])\b
RewriteRule ^/sampleApp/(.*) balancer://tc2/$1 [P]
RewriteCond %{QUERY_STRING} accountId=.*([7-9])\b
RewriteRule ^/sampleApp/(.*) balancer://tc3/$1 [P]
# No ID - round robin to all nodes
ProxyPass /sampleApp/ balancer://all/
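The routing decision made by these rewrite rules can be checked offline. The sketch below reuses the same three regular expressions with Python's re module (an unanchored search, as RewriteCond does) to show which balancer a query string would select; the function name is hypothetical, and the sketch covers only the matching step, not the proxying.

```python
import re

# Same patterns as the RewriteCond lines: the captured digit is the last
# digit of a run that ends at a word boundary.
RULES = [
    (re.compile(r"accountId=.*([0-2])\b"), "balancer://tc1"),
    (re.compile(r"accountId=.*([3-6])\b"), "balancer://tc2"),
    (re.compile(r"accountId=.*([7-9])\b"), "balancer://tc3"),
]
FALLBACK = "balancer://all"  # No ID: round robin to all nodes


def choose_balancer(query_string):
    """Return the balancer the first matching rule would select."""
    for pattern, balancer in RULES:
        if pattern.search(query_string):
            return balancer
    return FALLBACK


for qs in ("accountId=1041", "accountId=15", "accountId=9", "page=2"):
    print(qs, "->", choose_balancer(qs))
```

Because the patterns use a greedy `.*`, a query string with further parameters after accountId can match on a later digit, so the rules are best tested against realistic query strings.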
Using mod-rewrite example
<Proxy balancer://tc1>
  # Default worker for this balancer
  BalancerMember http://linux6401.dev.local:8080/sampleApp lbset=1
  # Backup balancers for node failure - used in round robin
  # no stickiness
  BalancerMember http://linux6402.dev.local:8081/sampleApp lbset=1 status=H
  BalancerMember http://linux6403.dev.local:8081/sampleApp lbset=1 status=H
  # Maintenance balancer used to re-route traffic for upgrades etc
  BalancerMember http://linux6404.dev.local:8080/sampleApp status=D
</Proxy>
Using mod-rewrite example
<Proxy balancer://tc2>
  BalancerMember http://linux6402.dev.local:8080/sampleApp lbset=1
  # Backup balancers for node failure - used in round robin
  # no stickiness
  BalancerMember http://linux6401.dev.local:8081/sampleApp lbset=1 status=H
  BalancerMember http://linux6403.dev.local:8081/sampleApp lbset=1 status=H
  # Maintenance balancer used to re-route traffic for upgrades etc
  BalancerMember http://linux6404.dev.local:8080/sampleApp status=D
</Proxy>
Using mod-rewrite example
<Proxy balancer://tc3>
  BalancerMember http://linux6403.dev.local:8080/sampleApp lbset=1
  # Backup balancers for node failure - used in round robin
  # no stickiness
  BalancerMember http://linux6401.dev.local:8081/sampleApp lbset=1 status=H
  BalancerMember http://linux6402.dev.local:8081/sampleApp lbset=1 status=H
  # Maintenance balancer used to re-route traffic for upgrades etc
  BalancerMember http://linux6404.dev.local:8080/sampleApp status=D
</Proxy>
Using mod-rewrite example
<Proxy balancer://all>
  BalancerMember http://linux6401:8080/sampleApp
  BalancerMember http://linux6402:8080/sampleApp
  BalancerMember http://linux6403:8080/sampleApp
</Proxy>
<Location /balancer-manager>
  SetHandler balancer-manager
  Order deny,allow
  Deny from all
  Allow from .dev.local
</Location>
What’s on the horizon? • Improving AJP • Adding additional protocols • mass_vhost like clusters/proxies • More dynamic configuration
Open Source: It’s just not for IT anymore! Jim Jagielski
Agenda Introduction What is the ASF What exactly is “Open Source” The Lessons Learned by the ASF
Introduction Jim Jagielski Longest still-active developer/contributor Co-founder of the ASF Member, Director and President
The ASF ASF == The Apache Software Foundation Before the ASF there was “The Apache Group” The ASF was incorporated in 1999
The ASF Non-profit corporation founded in 1999 501(c)(3) charity Volunteer organization Virtual world-wide organization Exists to provide the organizational, legal, and financial support for various OSS projects
The ASF’s Mission Provide open source software to the public free of charge Provide a foundation for open, collaborative software development projects by supplying hardware, communication, and business infrastructure Create an independent legal entity to which companies and individuals can donate resources and be assured that those resources will be used for the public benefit
The ASF’s Mission Provide a means for individual volunteers to be sheltered from legal suits directed at the Foundation’s projects Protect the ‘Apache’ brand, as applied to its software products, from being abused by other organizations Provide legal and technical infrastructure for open source software development and to perform appropriate oversight of
How We Work The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are characterized by a collaborative, consensus based development process, an open and pragmatic software license, and a desire to create high quality software that leads the way in its field. We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.
How We Work, Take 2 Community created code Our code should be exceptional
Structure of the ASF - dev Volunteer Driven Organization Software Projects are managed by Project Management Committees (PMCs) PMCs vote in new PMC members and committers At the end of the day: People / Individual focused
Structure of the ASF - legal Member-based corporation - individuals only Members nominate and elect new members Members elect a board - 9 seats Semi-annual meetings via IRC Each PMC has a Chair - eyes and ears of the board (oversight only)
ASF “Org Chart”
Development: Users, Patchers/Buggers, Contributors, Committers, PMC Members
Administrative: Members, Officers, Board
Issues with Dual Stacks Despite clear differentiation, sometimes there are leaks eg: PMC chair seen as “lead” developer Sometimes officers are assumed to have too much power if they venture into development issues “hats”
Why Open Source? Access to the source code Avoid vendor lock-in (or worse!) Much better software Better security record (more eyes) Much more nimble development - frequent releases Direct user input
The draw of Open Source Having a real impact in the development and direction of IT Personal satisfaction: I wrote that! Sense of membership in a community Sense of accomplishment - very quick turnaround times Developers and engineers love to tinker - huge opportunity to do so
Open Source FUD No quality or quality control Prevents or slows development Have to “give it away for free” No real innovation
What is Open Source? Open Source Licensing OSI Approved Free Software As in Free Speech, not Free Beer
What is Open Source? Basically, it’s a “new” way to develop, license and distribute code Actually, there was “open source” even before it was called that The key technologies behind the Internet and the Web are all Open Source based
True Open Source For software to be Open Source, it must be under an OSI approved Open Source License At last count, over 70 exist
Open Source Licenses Give Me Credit AL, BSD, MIT Give Me Fixes (L)GPL, EPL, MPL Give Me Everything GPL - Dave Johnson http://rollerweblogger.org/page/roller?entry=gimme_credit_gimme_fixes_gimmem
The Apache License (AL) A liberal open source software license - BSD-like Business friendly Requires attribution Includes Patent Grant Easily reused by other projects & organizations
Give Me Fixes MPL / EPL / (L)GPL Used mostly with platforms or libraries Protects the licensed code, but allows larger derivative works with different licensing Still very business friendly
Give Me Everything GPL (copyleft) Derivative works also under GPL Linked works could also be under GPL Viral nature may likely limit adoption GPL trumps all others or else incompatible
License Differences Mainly involve the licensing of derivative works Only really applies during (re)distribution of work Where the “freedom” should be mostly focused: the user or the code itself
One True License There is no such thing Licensing is selected to address what you are trying to do In general, Open Standards do better with AL-like license
The Apache Way Although the term is deprecated, “The Apache Way” relates to how the ASF (and its projects) work and operate Basically, the least common denominators on how PMCs operate
Basic Memes Meritocracy Peer-based Consensus decision making Collaborative development Responsible oversight
Meritocracy “Govern by Merit” Merit is based on what you do Merit never expires Those with merit get more responsibility
Peer-based Developers represent themselves - individuals Mutual trust and respect All votes hold the same weight Community created code Healthy communities create healthy code Poisonous communities don’t
Look Familiar? These concepts are not new or unique Best practices regarding how the Scientific and Health community works
Publish or Perish In Open Source, frequent releases indicate healthy activity What is collaborative s/w development other than peer review? Think how restrictive research would be w/o open communication
Why Community -> Code Since we are all volunteers, people’s time and interests change A healthy community is “warm and inviting” and encourages a continued influx of developers Poisonous people/communities turn people off, and the project will die End result - better code, long-term code
Consensus decision making Key is the idea of voting +1 - yes +0 - no real comment -1 - veto Sometimes you’ll also see stuff like -0, -0.5, etc...
Voting The main intent is to gauge developer acceptance Vetoes must be justifiable and have sound technical merit If valid, vetoes cannot be overruled Vetoes are very rare
Commit Process Review Then Commit (RTC) A patch is submitted to the project for inclusion If at least 3 +1s and no -1s, code is committed Good for stable branches Ensures enough “eyes on the code” on a direct-to-release path
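The RTC acceptance rule is simple enough to state as code. This is a minimal sketch of the rule as given on the slide (at least three +1s and no -1s); the function name is hypothetical, and fractional votes such as -0 or -0.5 are treated as neither approval nor veto, which is an assumption.

```python
def rtc_passes(votes):
    """Return True if a patch may be committed under Review Then Commit.

    votes is an iterable of numeric votes such as +1, 0 or -1.
    """
    votes = list(votes)
    plus_ones = sum(1 for v in votes if v == 1)
    vetoes = sum(1 for v in votes if v == -1)
    # A single veto blocks the commit, however many +1s there are.
    return plus_ones >= 3 and vetoes == 0


print(rtc_passes([1, 1, 1]))       # three approvals, no veto
print(rtc_passes([1, 1, 1, -1]))   # blocked by one veto
```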
Commit Process Commit Then Review (CTR) A patch is committed directly to the code Review Process happens post commit Good for development branches Depends on people doing reviews after the fact Allows very fast development
Commit Process Lazy Consensus variant of RTC “I plan on committing this in 3 days” Provides opportunity for oversight, but with known “deadline” As always, can be vetoed after the fact
Collaborative Development Code is developed by the community Voting ensures at least 3 active developers Development done online and on-list If it didn’t happen on-list, it didn’t happen
Collaborative Development Mailing lists are the preferred method Archived Asynchronous Available to anyone - public list
Collaborative Development Other methods are OK, if not primary Wikis IRC F2F Always bring back to the list
Responsible Oversight Ensure license compliance Track IP Quality code Quality community
The Apache Incubator Entry point for all new projects and codebases Indoctrinates the Apache Way to the podling Ensures and tracks IP
Contributor License Agreement aka: iCLA (for individual) Required of all committers Guarantees: The person has the authority to commit the code That the ASF can relicense the code Does NOT assign copyright
Success Stories - HTTPD Apache HTTP Server (“Apache”) Reference implementation of HTTP Most popular web server in existence Found in numerous commercial web servers Oracle, IBM,... Influenced countless more
Success Stories - HTTPD By having a “free” and open source reference implementation, the drive to create a separate proprietary version was reduced. “Why spend time and money, when we can use this” This allowed HTTP (and the Web) to grow and STAY usable (compare to the old browser wars)
Success Stories - Tomcat Apache Tomcat (Servlet Container) The default standard servlet container Each version maps to a specific spec. Bundled with numerous Java apps out there Likely a major influence on the diminishing relevance of JEE
Success Stories - Others Apache Geronimo (Server Framework - JEE ) Not “just” a JEE server Apache Maven (Java project build/management tool) Another industry standard Apache Logging, Axis, Struts, ...
Internal Advantages Much tighter community better communication (esp. with “users”) tighter feedback loop Ready made audience Don’t need to “sell” the aspects of Open Source which scare/ confuse IT people
Governance Gotchas
Trust and Merit are earned, which implies time The available pool might be relatively small
Some Concluding Thoughts Trust your developers AND your users Communication is key Open Source is NOT the Good Housekeeping Seal Of Approval But don’t believe in all the FUD either Success is not measured in market share, but in adoption
Some Concluding Thoughts Open Source should have a viable business or emotional reason Give some thought to licensing Make it easier for developers and users to “join” Give them a reason to
Helpful links The Apache Software Foundation www.apache.org Want to help a great organization? www.marylandstateboychoir.org
That’s It Thank you! Any questions?