

Apache httpd v2.4: Hello Cloud Jim Jagielski


What we will cover • Performance Related Enhancements • Reverse Proxy Server Enhancements


Apache httpd 2.4 Currently in beta release Expected GA: This May! Significant Improvements high-performance cloud suitability


Apache httpd 2.4 Support for async I/O w/o dropping support for older systems Larger selection of usable MPMs: added Event, Simple, etc... Leverages higher-performance versions of APR


Apache httpd 2.4 Bandwidth control now standard Finer control of timeouts, esp. during requests Controllable buffering of I/O Support for Lua
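As a rough illustration of the bandwidth and timeout bullets above, the sketch below assumes these features come from mod_ratelimit and mod_reqtimeout. The /downloads path and all numeric limits are placeholders, not values from the talk.

```apache
# Assumes mod_ratelimit and mod_reqtimeout are loaded.
# Throttle responses under the hypothetical /downloads path to about 400 KiB/s.
<Location "/downloads">
    SetOutputFilter RATE_LIMIT
    SetEnv rate-limit 400
</Location>

# Allow 20 seconds for the request headers, extended up to 40 seconds
# while the client keeps sending at least 500 bytes/s.
# Give the request body 20 seconds with the same minimum rate.
RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500
```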


Apache httpd 2.4 Reverse Proxy Improvements Supports FastCGI, SCGI Additional load balancing mechanisms Runtime changing of clusters w/o restarts Support for dynamic configuration
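A minimal sketch of the FastCGI and SCGI support mentioned above, assuming mod_proxy_fcgi and mod_proxy_scgi are the modules involved. The paths, host and port numbers are placeholders chosen for illustration.

```apache
# Assumes mod_proxy, mod_proxy_fcgi and mod_proxy_scgi are loaded.
# Hand /app/ to a FastCGI backend and /scgi-bin/ to an SCGI backend.
ProxyPass "/app/" "fcgi://localhost:9000/"
ProxyPass "/scgi-bin/" "scgi://localhost:4000/"
```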


mod_proxy • An Apache module • Implements core proxy capability • Both forward and reverse proxy • In general, most people use it for reverse proxy (gateway) functionality


How did we get here? • A stroll down mod_proxy lane – First available in Apache 1.1 • “Experimental Caching Proxy Server” – In Apache 1.2, pretty stable, but just HTTP/1.0 – In Apache 1.3, much improved with added support for HTTP/1.1 – In Apache 2.0, break out cache and proxy – In Apache 2.2, lay framework


Proxy Improvements • Becoming a robust but generic proxy implementation • Support various protocols – HTTP, HTTPS, CONNECT, FTP – AJP, FastCGI, SCGI, WSGI (soon) – Load balancing • Clustering, failover


AJP? Really? • Yep, Apache can now talk AJP with Tomcat directly • mod_proxy_ajp is the magic mojo • Other proxy improvements make this even more exciting • mod_jk alternative


But I like mod_jk • That’s fine, but... – Now the config is much easier and more consistent

ProxyPass /servlets ajp://tc.example.com:8089

– Easier when Apache needs to proxy both HTTP and AJP – Leverage improvements in proxy module


Features of Proxy Server • Performance • Monitoring • Filtering • Caching (with mod_cache)


Reverse Proxy • Operated at the server end of the transaction • Completely transparent to the Web Browser – thinks the Reverse Proxy Server is the real server

[Diagram: Browser, Internet (cloud), Firewall, Reverse Proxy Server, Firewall, Transactional Servers]


Features of Reverse Proxy • Security – Uniform security policy can be administered – The real transactional servers are behind the firewall • Delegation, Specialization, Load Balancing


Configuring Reverse Proxy • Set ProxyRequests Off • Apply ProxyPass, ProxyPassReverse and possibly RewriteRule directives


Reverse Proxy Directives: ProxyPass • Allows remote server to be mapped into the space of the local (Reverse Proxy) server • Example: – ProxyPass /secure/ http://secureserver/

– Presumably “secureserver” is inaccessible directly from the internet


Reverse Proxy Directives: ProxyPassReverse • Used to specify that redirects issued by the remote server are to be translated to use the proxy before being returned to the client. • Syntax is identical to ProxyPass; used in conjunction with it • Example: – ProxyPassReverse /secure/ http://secureserver/
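Put together, the two directives might be combined as in the sketch below. It reuses the “secureserver” name from the slides and assumes mod_proxy and mod_proxy_http are loaded.

```apache
# Act as a reverse proxy only; never as an open forward proxy.
ProxyRequests Off

# Map /secure/ on this server onto the backend.
ProxyPass        "/secure/" "http://secureserver/"
# Rewrite redirects issued by the backend so they point back at the proxy.
ProxyPassReverse "/secure/" "http://secureserver/"
```

With this in place, a backend redirect to a hypothetical http://secureserver/login would reach the client as a redirect to /secure/login on the proxy.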


Simple Rev Proxy • All requests for /images to a backend server

ProxyPass /images http://images.example.com/

ProxyPass <path> <scheme>://<full url>

• Useful, but limited • What if: – images.example.com dies? – traffic for /images increases


Baby got back • We need more backend servers • And balance the load between them • Before 2.2, mod_rewrite was your only option • Some people would prefer spending an evening with a life insurance salesman rather than deal with mod_rewrite


Load Balancer • mod_proxy_balancer.so • mod_proxy can do native load balancing – weight by actual requests – weight by traffic – weight by busyness – lbfactors


Load Balancer • LB algorithms are implemented as providers – easy to add – no core code changes required – growing list of methods
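Because the algorithms are providers, in httpd 2.4 each one lives in its own module and has to be loaded before its lbmethod can be used. A sketch, assuming a conventional modules/ directory (the path is an assumption and varies by distribution):

```apache
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
# Shared memory provider used by the balancer
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so

# One module per load balancing method
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
LoadModule lbmethod_bytraffic_module modules/mod_lbmethod_bytraffic.so
LoadModule lbmethod_bybusyness_module modules/mod_lbmethod_bybusyness.so
```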


Load Balancer • Backend connection pooling • Available for named workers: – eg: ProxyPass /foo http://bar.example.com • Reusable connection to origin – For threaded MPMs, can adjust size of pool (min, max, smax) – For prefork: singleton • Shared data held in shared memory


Pooling example

<Proxy balancer://foo>
  BalancerMember http://www1.example.com:80/ loadfactor=1
  BalancerMember http://www2.example.com:80/ loadfactor=1
  BalancerMember http://www3.example.com:80/ loadfactor=4 status=+h
  ProxySet lbmethod=bytraffic
</Proxy>


Load Balancer • Sticky session support – aka “session affinity” • Cookie based – stickysession=PHPSESSID – stickysession=JSESSIONID • Natively easy with Tomcat • May require more setup for “simple” HTTP proxying
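For the Tomcat case, a sketch of what sticky sessions could look like. It reuses the tc1 and tc2 host names and AJP port used elsewhere in these slides, and assumes each Tomcat instance has a matching jvmRoute configured.

```apache
<Proxy "balancer://javaapps">
    # route must match the jvmRoute configured on each Tomcat instance
    BalancerMember "ajp://tc1:8089" route=tc1
    BalancerMember "ajp://tc2:8089" route=tc2
    # Cookie name, then the URL-encoded parameter name, separated by |
    ProxySet stickysession=JSESSIONID|jsessionid
</Proxy>
ProxyPass "/serv/" "balancer://javaapps/"
```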


Load Balancer • Cluster set with failover • Group backend servers as numbered sets – balancer will try lower-valued sets first – If no workers are available, will try next set • Hot standby


Example

<Proxy balancer://foo>
  BalancerMember http://php1:8080/ loadfactor=1
  BalancerMember http://php2:8080/ loadfactor=4
  BalancerMember http://phpbkup:8080/ loadfactor=4 status=+h
  BalancerMember http://offsite1:8080/ lbset=1
  BalancerMember http://offsite2:8080/ lbset=1
  ProxySet lbmethod=bytraffic
</Proxy>
ProxyPass /apps/ balancer://foo/


Embedded Admin • Allows for real-time – Monitoring of stats for each worker – Adjustment of worker params • lbset • load factor • route • enabled / disabled • ...


Embedded Admin • Allows for real-time • Addition of new workers/nodes • Change of LB methods • Can be persistent • More RESTful • Can be CLI-driven


Easy setup

<Location /balancer-manager>
  SetHandler balancer-manager
  Order Deny,Allow
  Deny from all
  Allow from 192.168.2.22
</Location>
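In httpd 2.4 the Order/Deny/Allow directives are still available through mod_access_compat, but the same restriction can also be written with the newer Require syntax from mod_authz_core. A sketch using the same address as the slide:

```apache
<Location "/balancer-manager">
    SetHandler balancer-manager
    # Only this address may reach the manager; everyone else is denied
    Require ip 192.168.2.22
</Location>
```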


Admin

[Screenshots of the balancer-manager interface, including: Changing the LBmethod; Adding new worker]


Some tuning params • For workers: – loadfactor • normalized load for worker [1]

– lbset • worker cluster number [0]

– retry • retry timeout, in seconds, for failed workers [60]


Some tuning params • For workers - connection pool: – min • Initial number of connections [0]

– max • Hard maximum number of connections [1|TPC, i.e. ThreadsPerChild]

– smax: • soft max - keep this number available [max] • time to live for connections above smax


Some tuning params • For workers - connection pool: – disablereuse: • bypass the connection pool

– ttl • time to live for connections above smax
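The pool parameters are given as key=value pairs on the worker. A sketch with made-up numbers, reusing the bar.example.com worker from the earlier slide:

```apache
# Hypothetical values: keep up to 5 idle connections around (smax),
# allow at most 20 (max), and free unused connections above smax
# once they have been idle for 120 seconds (ttl).
ProxyPass "/foo" "http://bar.example.com" min=0 smax=5 max=20 ttl=120
```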


Some tuning params For workers (cont): – connectiontimeout/timeout • Connection timeouts on backend [ProxyTimeout] – flushpackets * • Does proxy need to flush data with each chunk of data? – on : Yes | off : No | auto : wait and see

– flushwait * • ms to wait for data before flushing


Some tuning params For workers (cont): – status (+/-) • D : disabled • S : Stopped • I : Ignore errors • H : Hot standby • E : Error


Some tuning params For balancers: – lbmethod • load balancing algo to use [byrequests] – stickysession • sticky session name (eg: PHPSESSID) – maxattempts • failover tries before we bail – nofailover • Back-ends don't support failover so don't send session when failing over
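Worker parameters go on the BalancerMember line, while balancer parameters go in ProxySet. A sketch that combines a few of the parameters above, with illustrative values and the php1/php2 hosts from the earlier examples:

```apache
<Proxy "balancer://foo">
    # Worker parameters: retry a failed worker after 30 seconds
    # instead of the default 60
    BalancerMember "http://php1:8080" retry=30
    BalancerMember "http://php2:8080" retry=30

    # Balancer parameters
    ProxySet lbmethod=bybusyness stickysession=PHPSESSID maxattempts=1
</Proxy>
```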


Recent improvements • ProxyPassMatch – ProxyPass can now take regex’s instead of just “paths”

ProxyPassMatch ^(/.*\.gif)$ http://backend.example.com$1

– JkMount migration • Or

ProxyPass ~ ^(/.*\.gif)$ http://backend.example.com$1

• mod_rewrite is balancer aware


Recent improvements • ProxyPassReverse is NOW balancer aware! • The below will work:

<Proxy balancer://foo>
  BalancerMember http://php1:8080/ loadfactor=1
  BalancerMember http://php2:8080/ loadfactor=4
</Proxy>
ProxyPass /apps/ balancer://foo/
ProxyPassReverse /apps/ balancer://foo/


Useful Envars • BALANCER_SESSION_STICKY – This is assigned the stickysession value used in the current request. It is the cookie or parameter name used for sticky sessions • BALANCER_SESSION_ROUTE – This is assigned the route parsed from the current request. • BALANCER_NAME – This is assigned the name of the balancer used for the current request. The value is something like balancer://foo.


Useful Envars • BALANCER_WORKER_NAME – This is assigned the name of the worker used for the current request. The value is something like http://hostA:1234. • BALANCER_WORKER_ROUTE – This is assigned the route of the worker that will be used for the current request. • BALANCER_ROUTE_CHANGED – This is set to 1 if the session route does not match the worker route (BALANCER_SESSION_ROUTE != BALANCER_WORKER_ROUTE) or the session does not yet have an established route. This can be used to determine when/if the client needs to be sent an updated route when sticky sessions are used.
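One way these variables can be used is to let the proxy itself issue the sticky cookie for plain HTTP backends. The sketch below assumes mod_headers is loaded; the ROUTEID cookie name and the route values are illustrative choices, not something from the slides.

```apache
# Send a ROUTEID cookie only when the route is new or has changed
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

<Proxy "balancer://foo">
    BalancerMember "http://php1:8080" route=1
    BalancerMember "http://php2:8080" route=2
    # Use the cookie set above to pin a client to its worker
    ProxySet stickysession=ROUTEID
</Proxy>
ProxyPass "/apps/" "balancer://foo/"
```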


Putting it all together

<Proxy balancer://foo>
  BalancerMember http://php1:8080/ loadfactor=1
  BalancerMember http://php2:8080/ loadfactor=4
  BalancerMember http://phpbkup:8080/ loadfactor=4 status=+h
  BalancerMember http://phpexp:8080/ lbset=1
  ProxySet lbmethod=bytraffic
</Proxy>
<Proxy balancer://javaapps>
  BalancerMember ajp://tc1:8089/ loadfactor=1
  BalancerMember ajp://tc2:8089/ loadfactor=4
  ProxySet lbmethod=byrequests
</Proxy>


Putting it all together

ProxyPass /apps/ balancer://foo/
ProxyPass /serv/ balancer://javaapps/
ProxyPass /images/ http://images:8080/


Manipulating HTTP Headers: • Modify HTTP request and response headers – Can be used in Main server, Vhost, Directory, Location, Files sections – Headers can be merged, replaced or removed – Pass on client-specific data to the backend server • IP Address, Request scheme (HTTP, HTTPS), UserAgent, SSL connection info, etc.


Manipulating HTTP Headers: • Shield backend server’s info from the clients – Strip out Server name – Server IP address – etc.


Header examples • Copy all request headers that begin with “TS” to response headers – Header echo ^TS • Say hello to Joe – Header add JoeHeader "Hello Joe!" • If header “MyRequestHeader: value” is present, response will contain “MyHeader” header: – SetEnvIf MyRequestHeader value HAVE_MyRequestHeader – Header add MyHeader "%D %t mytext" env=HAVE_MyRequestHeader


Header examples • Remember, sequence is important! The following will result in “MyHeader” being stripped from the request: – RequestHeader append MyHeader "value1" – RequestHeader append MyHeader "value2" – RequestHeader unset MyHeader


Example: • Pass additional info about Client Browsers to the App Server:

ProxyPass / http://backend.covalent.net
ProxyPassReverse / http://backend.covalent.net
RequestHeader set X-Forwarded-IP %{REMOTE_ADDR}e
RequestHeader set X-Request-Scheme %{REQUEST_SCHEME}e

• App Server receives the following HTTP headers: – X-Forwarded-IP: 10.0.0.3 – X-Request-Scheme: https
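The %{...}e format reads environment variables, which may not be populated for every request. On httpd 2.4.10 or later, the same two headers can be set from expression variables instead. A sketch of that alternative, offered as an option rather than as what the slides prescribe:

```apache
# Requires httpd 2.4.10 or later and mod_headers
RequestHeader set X-Forwarded-IP "expr=%{REMOTE_ADDR}"
RequestHeader set X-Request-Scheme "expr=%{REQUEST_SCHEME}"
```

Note that mod_proxy also adds X-Forwarded-For, X-Forwarded-Host and X-Forwarded-Server to proxied requests on its own.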


Using mod-rewrite example

# mod_proxy lb example using request parameter
RewriteEngine On

# Use mod_rewrite to insert a node name into the url
RewriteCond %{QUERY_STRING} accountId=.*([0-2])\b
RewriteRule ^/sampleApp/(.*) balancer://tc1/$1 [P]

RewriteCond %{QUERY_STRING} accountId=.*([3-6])\b
RewriteRule ^/sampleApp/(.*) balancer://tc2/$1 [P]

RewriteCond %{QUERY_STRING} accountId=.*([7-9])\b
RewriteRule ^/sampleApp/(.*) balancer://tc3/$1 [P]

# No ID - round robin to all nodes
ProxyPass /sampleApp/ balancer://all/


Using mod-rewrite example

<Proxy balancer://tc1>
  # Default worker for this balancer
  BalancerMember http://linux6401.dev.local:8080/sampleApp lbset=1

  # Backup balancers for node failure - used in round robin
  # no stickyness
  BalancerMember http://linux6402.dev.local:8081/sampleApp lbset=1 status=H
  BalancerMember http://linux6403.dev.local:8081/sampleApp lbset=1 status=H

  # Maintenance balancer used to re-route traffic for upgrades etc
  BalancerMember http://linux6404.dev.local:8080/sampleApp status=D
</Proxy>


Using mod-rewrite example

<Proxy balancer://tc2>
  BalancerMember http://linux6402.dev.local:8080/sampleApp lbset=1

  # Backup balancers for node failure - used in round robin
  # no stickyness
  BalancerMember http://linux6401.dev.local:8081/sampleApp lbset=1 status=H
  BalancerMember http://linux6403.dev.local:8081/sampleApp lbset=1 status=H

  # Maintenance balancer used to re-route traffic for upgrades etc
  BalancerMember http://linux6404.dev.local:8080/sampleApp status=D
</Proxy>


Using mod-rewrite example

<Proxy balancer://tc3>
  BalancerMember http://linux6403.dev.local:8080/sampleApp lbset=1

  # Backup balancers for node failure - used in round robin
  # no stickyness
  BalancerMember http://linux6401.dev.local:8081/sampleApp lbset=1 status=H
  BalancerMember http://linux6402.dev.local:8081/sampleApp lbset=1 status=H

  # Maintenance balancer used to re-route traffic for upgrades etc
  BalancerMember http://linux6404.dev.local:8080/sampleApp status=D
</Proxy>


Using mod-rewrite example

<Proxy balancer://all>
  BalancerMember http://linux6401:8080/sampleApp
  BalancerMember http://linux6402:8080/sampleApp
  BalancerMember http://linux6403:8080/sampleApp
</Proxy>

<Location /balancer-manager>
  SetHandler balancer-manager
  Order deny,allow
  Deny from all
  Allow from .dev.local
</Location>


What’s on the horizon? • Improving AJP • Adding additional protocols • mass_vhost like clusters/proxies • More dynamic configuration


Open Source: It’s just not for IT anymore! Jim Jagielski


Agenda Introduction What is the ASF What exactly is “Open Source” The Lessons Learned by the ASF


Introduction Jim Jagielski Longest still-active developer/contributor Co-founder of the ASF Member, Director and President


The ASF ASF == The Apache Software Foundation Before the ASF there was “The Apache Group” The ASF was incorporated in 1999


The ASF Non-profit corporation founded in 1999 501(c)(3) charity Volunteer organization Virtual world-wide organization Exists to provide the organizational, legal, and financial support for various OSS projects


The ASF’s Mission Provide open source software to the public free of charge Provide a foundation for open, collaborative software development projects by supplying hardware, communication, and business infrastructure Create an independent legal entity to which companies and individuals can donate resources and be assured that those resources will be used for the public benefit


The ASF’s Mission Provide a means for individual volunteers to be sheltered from legal suits directed at the Foundation’s projects Protect the ‘Apache’ brand, as applied to its software products, from being abused by other organizations Provide legal and technical infrastructure for open source software development and to perform appropriate oversight of


How We Work The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are characterized by a collaborative, consensus based development process, an open and pragmatic software license, and a desire to create high quality software that leads the way in its field. We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.


How We Work, Take 2 Community created code Our code should be exceptional


Structure of the ASF - dev Volunteer Driven Organization Software Projects are managed by Project Management Committees (PMCs) PMCs vote in new PMC members and committers At the end of the day: People / Individual focused


Structure of the ASF - legal Member-based corporation - individuals only Members nominate and elect new members Members elect a board - 9 seats Semi-annual meetings via IRC Each PMC has a Chair - eyes and ears of the board (oversight only)


ASF “Org Chart”

Development: PMC Members, Committers, Contributors, Patchers/Buggers, Users

Administrative: Members, Officers, Board


Issues with Dual Stacks Despite clear differentiation, sometimes there are leaks eg: PMC chair seen as “lead” developer Sometimes officers are assumed to have too much power if they venture into development issues “hats”


Why Open Source? Access to the source code Avoid vendor lock-in (or worse!) Much better software Better security record (more eyes) Much more nimble development - frequent releases Direct user input


The draw of Open Source Having a real impact in the development and direction of IT Personal satisfaction: I wrote that! Sense of membership in a community Sense of accomplishment - very quick turnaround times Developers and engineers love to tinker - huge opportunity to do so


Open Source FUD No quality or quality control Prevents or slows development Have to “give it away for free” No real innovation


What is Open Source? Open Source Licensing OSI Approved Free Software As in Free Speech, not Free Beer


What is Open Source? Basically, it’s a “new” way to develop, license and distribute code Actually, there was “open source” even before it was called that The key technologies behind the Internet and the Web are all Open Source based


True Open Source For software to be Open Source, it must be under an OSI approved Open Source License At last count, over 70 exist


Open Source Licenses Give Me Credit AL, BSD, MIT Give Me Fixes (L)GPL, EPL, MPL Give Me Everything GPL - Dave Johnson http://rollerweblogger.org/page/roller?entry=gimme_credit_gimme_fixes_gimmem


The Apache License (AL) A liberal open source software license - BSD-like Business friendly Requires attribution Includes Patent Grant Easily reused by other projects & organizations


Give Me Fixes MPL / EPL / (L)GPL Used mostly with platforms or libraries Protects the licensed code, but allows larger derivative works with different licensing Still very business friendly


Give Me Everything GPL (copyleft) Derivative works also under GPL Linked works could also be under GPL Viral nature may likely limit adoption GPL trumps all others or else incompatible


License Differences Mainly involve the licensing of derivative works Only really applies during (re)distribution of work Where the “freedom” should be mostly focused: the user or the code itself


One True License There is no such thing Licensing is selected to address what you are trying to do In general, Open Standards do better with AL-like license


The Apache Way Although the term is deprecated, “The Apache Way” relates to how the ASF (and its projects) work and operate Basically, the least common denominators on how PMCs operate


Basic Memes Meritocracy Peer-based Consensus decision making Collaborative development Responsible oversight


Meritocracy “Govern by Merit” Merit is based on what you do Merit never expires Those with merit, get more responsibility


Peer-based Developers represent themselves - individuals Mutual trust and respect All votes hold the same weight Community created code Healthy communities create healthy code Poisonous communities don’t


Look Familiar? These concepts are not new or unique Best practices regarding how the Scientific and Health communities work


Publish or Perish In Open Source, frequent releases indicate healthy activity What is collaborative s/w development other than peer review? Think how restrictive research would be w/o open communication


Why Community -> Code Since we are all volunteers, people’s time and interests change A healthy community is “warm and inviting” and encourages a continued influx of developers Poisonous people/communities turn people off, and the project will die End result - better code, long-term code


Consensus decision making Key is the idea of voting +1 - yes +0 - no real comment -1 - veto Sometimes you’ll also see stuff like -0, -0.5, etc...


Voting The main intent is to gauge developer acceptance Vetoes must be justifiable and have sound technical merit If valid, vetoes cannot be overruled Vetoes are very rare


Commit Process Review Then Commit (RTC) A patch is submitted to the project for inclusion If at least 3 +1s and no -1s, code is committed Good for stable branches Ensures enough “eyes on the code” on a direct-to-release path


Commit Process Commit Then Review (CTR) A patch is committed directly to the code Review Process happens post commit Good for development branches Depends on people doing reviews after the fact Allows very fast development


Commit Process Lazy Consensus variant of RTC “I plan on committing this in 3 days” Provides opportunity for oversight, but with known “deadline” As always, can be vetoed after the fact


Collaborative Development Code is developed by the community Voting ensures at least 3 active developers Development done online and on-list If it didn’t happen on-list, it didn’t happen


Collaborative Development Mailing lists are the preferred method Archived Asynchronous Available to anyone - public list


Collaborative Development Other methods are OK, if not primary Wikis IRC F2F Always bring back to the list


Responsible Oversight Ensure license compliance Track IP Quality code Quality community


The Apache Incubator Entry point for all new projects and codebases Indoctrinates the Apache Way to the podling Ensures and tracks IP


Contributor License Agreement aka: iCLA (for individual) Required of all committers Guarantees: The person has the authority to commit the code That the ASF can relicense the code Does NOT assign copyright


Success Stories - HTTPD Apache HTTP Server (“Apache”) Reference implementation of HTTP Most popular web server in existence Found in numerous commercial web servers Oracle, IBM,... Influenced countless more


Success Stories - HTTPD By having a “free” and open source reference implementation, the drive to create a separate proprietary version was reduced. “Why spend time and money, when we can use this” This allowed HTTP (and the Web) to grow and STAY usable (compare to the old browser wars)


Success Stories - Tomcat Apache Tomcat (Servlet Container) The default standard servlet container Each version maps to a specific spec. Bundled with numerous Java apps out there Likely a major influence on the diminishing relevance of JEE


Success Stories - Others Apache Geronimo (Server Framework - JEE ) Not “just” a JEE server Apache Maven (Java project build/management tool) Another industry standard Apache Logging, Axis, Struts, ...


Internal Advantages Much tighter community better communication (esp. with “users”) tighter feedback loop Ready made audience Don’t need to “sell” the aspects of Open Source which scare/ confuse IT people


Governance Gotchas

Trust and Merit are earned, which implies time The available pool might be relatively small


Some Concluding Thoughts Trust your developers AND your users Communication is key Open Source is NOT the Good Housekeeping Seal Of Approval But don’t believe in all the FUD either Success is not measured in market share, but in adoption


Some Concluding Thoughts Open Source should have a viable business or emotional reason Give some thought to licensing Make it easier for developers and users to “join” Give them a reason to


Helpful links The Apache Software Foundation www.apache.org Want to help a great organization? www.marylandstateboychoir.org


That’s It Thank you! Any questions?

